COVID-19 Communication Analytics by Africa CDC

This project analyzes how Africa CDC used Facebook and Instagram to communicate COVID-19 information across Africa. By collecting 224 posts with data scraping tools and analyzing engagement patterns in RStudio, the study evaluates the effectiveness of multimedia content, messaging strategies, and audience interaction. The insights offer recommendations for improving public health communication through social media.




By Mathew Shem

Biostatistician | Data Scientist | Founder, StatQuestJourney Hub

📌 Introduction

At the height of the COVID-19 pandemic, effective communication was not just a need; it was a lifeline. As the African continent battled misinformation, poor health infrastructure, and a rising infection curve, Africa CDC turned to social media, specifically Facebook and Instagram, to disseminate vital health messages.

This project, from scratch to insights, analyzes 224 Africa CDC social media posts to evaluate engagement, multimedia strategy, and public response. I collected and cleaned the data, performed in-depth analysis using RStudio, and interpreted the results in light of public health communication.

👇 Here’s a complete walkthrough of the process and insights.

🗃️ 1. Data Collection

📌 Sources:

  • Africa CDC posts on Facebook

  • Africa CDC posts on Instagram

🔧 Tools Used:

  • BeautifulSoup4 (Python) – for parsing HTML data.

  • Instaloader (Python) – for scraping Instagram posts.

  • ParseHub (GUI) – for collecting Facebook posts with timestamps, likes, comments, and shares.

🧪 Fields Collected:

  • Post text, Date, Likes, Comments, Shares, Hashtags (#COVID19)
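As a rough illustration of the fields listed above, here is a minimal sketch of one post record in plain Python (standard library only). The class name `Post`, the helper names, and the sample text are hypothetical and not taken from the project; the hashtag rule (a `#` followed by letters, digits, or underscores) is also an assumption.

```python
import re
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

# Assumed hashtag rule: '#' followed by letters, digits, or underscores.
HASHTAG_RE = re.compile(r"#(\w+)")


@dataclass
class Post:
    """One scraped post; field names are illustrative, not the project's."""
    platform: str                     # "facebook" or "instagram"
    text: str
    posted_on: date
    likes: Optional[int] = None
    comments: Optional[int] = None
    shares: Optional[int] = None      # None when no share count was collected
    hashtags: List[str] = field(default_factory=list)


def extract_hashtags(text: str) -> List[str]:
    """Return lowercase hashtags (without '#') in order of appearance."""
    return [tag.lower() for tag in HASHTAG_RE.findall(text)]


def make_post(platform, text, posted_on, likes=None, comments=None, shares=None):
    """Build a Post and fill its hashtag list from the post text."""
    return Post(platform, text, posted_on, likes, comments, shares,
                extract_hashtags(text))


# Invented sample post, for illustration only.
example = make_post("instagram", "Wear a mask. #COVID19 #AfricaCDC",
                    date(2020, 6, 1), likes=40, comments=1)
print(example.hashtags)
```

Keeping `shares` optional makes it explicit when a platform simply did not provide the value, rather than recording it as zero.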



🧹 2. Data Cleaning & Integration

After extraction, I encountered:

  • Missing share counts on Instagram (platform limitation)

  • Some engagement fields missing from Facebook posts

Steps Taken:

  • Removed duplicates

  • Handled nulls using complete case analysis

  • Merged datasets into a single data frame

📌 The cleaned dataset included only #COVID19-tagged posts published from 2020 to 2022.
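The cleaning steps can be sketched roughly as follows in plain Python. This is not the project's actual code: the row layout, the `REQUIRED` mapping, and the sample rows are invented for illustration. One loud assumption: complete-case filtering is applied per platform, so Instagram rows are not dropped just because they have no share count.

```python
from datetime import date

# Invented sample rows; the real dataset is not reproduced here.
facebook_rows = [
    {"platform": "facebook", "text": "Mask up #COVID19",
     "date": date(2020, 5, 1), "likes": 100, "comments": 30, "shares": 10},
    {"platform": "facebook", "text": "Mask up #COVID19",           # duplicate
     "date": date(2020, 5, 1), "likes": 100, "comments": 30, "shares": 10},
    {"platform": "facebook", "text": "Vaccines work #COVID19",     # missing likes
     "date": date(2021, 3, 2), "likes": None, "comments": 5, "shares": 1},
    {"platform": "facebook", "text": "World Malaria Day",          # not tagged
     "date": date(2021, 4, 25), "likes": 20, "comments": 2, "shares": 1},
]
instagram_rows = [
    {"platform": "instagram", "text": "Wash your hands #COVID19",
     "date": date(2020, 7, 9), "likes": 40, "comments": 1, "shares": None},
    {"platform": "instagram", "text": "Old post #COVID19",         # out of range
     "date": date(2019, 12, 31), "likes": 5, "comments": 0, "shares": None},
]

# Assumption: fields that must be present for a row to count as "complete".
REQUIRED = {
    "facebook": ("likes", "comments", "shares"),
    "instagram": ("likes", "comments"),
}


def clean(rows):
    """Deduplicate, drop incomplete rows, keep #COVID19 posts from 2020-2022."""
    seen = set()
    kept = []
    for row in rows:
        key = (row["platform"], row["text"], row["date"])
        if key in seen:
            continue                      # exact duplicate of an earlier row
        seen.add(key)
        if any(row[f] is None for f in REQUIRED[row["platform"]]):
            continue                      # complete-case: a required value is missing
        if "#covid19" not in row["text"].lower():
            continue                      # keep only #COVID19-tagged posts
        if not 2020 <= row["date"].year <= 2022:
            continue                      # keep only the study window
        kept.append(row)
    return kept


# "Merging" here is simply concatenating both platforms before cleaning.
posts = clean(facebook_rows + instagram_rows)
print(len(posts))
```

Whether a missing Instagram share count should count as "incomplete" is a design decision; treating it as structurally missing keeps Instagram posts in the merged dataset.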

🔍 3. Data Exploration (EDA)

Using ggplot2, dplyr, and lubridate, I performed initial exploration to understand:

  • Posting trends over time

  • Engagement distribution across platforms

  • Type of multimedia and framing

Key Metrics:

  • Engagement Score = Likes + Comments + Shares

  • Engagement over time using line plots
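To make the engagement metric concrete, here is a small sketch in plain Python that computes the score and sums it by platform and month, which is the kind of table a line plot would be drawn from. The function names and sample rows are hypothetical, and counting a missing share value as 0 is an assumption.

```python
from collections import defaultdict
from datetime import date


def engagement_score(likes, comments, shares=None):
    """Likes + comments + shares; a missing share count is counted as 0."""
    return likes + comments + (shares or 0)


def monthly_engagement(posts):
    """Total engagement per (platform, 'YYYY-MM'), sorted by key."""
    totals = defaultdict(int)
    for p in posts:
        key = (p["platform"], p["date"].strftime("%Y-%m"))
        totals[key] += engagement_score(p["likes"], p["comments"], p.get("shares"))
    return dict(sorted(totals.items()))


# Invented sample rows, for illustration only.
sample = [
    {"platform": "facebook", "date": date(2020, 5, 1),
     "likes": 100, "comments": 30, "shares": 10},
    {"platform": "facebook", "date": date(2020, 5, 20),
     "likes": 50, "comments": 10, "shares": 5},
    {"platform": "instagram", "date": date(2020, 5, 3),
     "likes": 40, "comments": 1, "shares": None},
]
print(monthly_engagement(sample))
```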

Findings:

  • Facebook had consistently higher engagement

  • Engagement spiked during major health announcements (e.g., PACT initiative)

  • Instagram engagement came mostly from likes, with far fewer comments and no share counts

🧠 4. Content Analysis

Categorization based on:

  • Visual Type: Infographics, Motivational images, Instructional videos

  • Message Type: Protective, Scientific, Emotional

Coding Framework:

Developed a structured Codebook to tag:

  • Multimedia presence

  • Crisis framing

  • Influencer use

  • Emotional tone
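One way to picture such a codebook is as a table of variables and their allowed values, with a check that each coded post uses only those values. The sketch below is an assumption about its shape, not the project's actual codebook: the variable names, the `"none"` visual type, and the emotional tone levels are hypothetical.

```python
# Hypothetical codebook: each variable maps to its allowed codes.
CODEBOOK = {
    "visual_type": {"infographic", "motivational_image",
                    "instructional_video", "none"},
    "message_type": {"protective", "scientific", "emotional"},
    "crisis_framing": {True, False},
    "influencer_use": {True, False},
    "emotional_tone": {"positive", "neutral", "negative"},  # assumed levels
}


def validate_codes(codes):
    """Return a list of problems; an empty list means the coding is valid."""
    problems = []
    for variable, allowed in CODEBOOK.items():
        if variable not in codes:
            problems.append(f"missing: {variable}")
        elif codes[variable] not in allowed:
            problems.append(f"invalid {variable}: {codes[variable]!r}")
    return problems


def has_multimedia(codes):
    """Multimedia presence, derived here from the visual type (an assumption)."""
    return codes["visual_type"] != "none"


# Invented coded post, for illustration only.
coded_post = {
    "visual_type": "infographic",
    "message_type": "protective",
    "crisis_framing": True,
    "influencer_use": False,
    "emotional_tone": "neutral",
}
print(validate_codes(coded_post), has_multimedia(coded_post))
```

Validating codes against a fixed list before analysis catches typos early and keeps both coders working from the same categories.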

🔬 5. Statistical Analysis & Results

🧪 Inter-Rater Reliability

  • Coded posts with the help of a second coder

  • Fleiss’ Kappa = 0.89 → Excellent agreement
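For readers unfamiliar with the statistic, here is a minimal sketch of Fleiss' kappa in plain Python, following the standard formula (mean observed agreement per item compared with chance agreement from the overall category proportions). The function name and the toy labels are hypothetical; this is not the code used to obtain the 0.89 reported above.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list with one label per rater.

    Every item must have the same number of ratings (at least two).
    Undefined (division by zero) if all ratings fall in a single category.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    category_totals = Counter()
    agreement_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Share of rater pairs that agree on this item.
        agreement_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )

    p_bar = agreement_sum / n_items                      # observed agreement
    total = n_items * n_raters
    p_e = sum((t / total) ** 2 for t in category_totals.values())  # chance agreement
    return (p_bar - p_e) / (1 - p_e)


# Invented labels from two coders on four posts, for illustration only.
toy = [
    ["protective", "protective"],
    ["scientific", "scientific"],
    ["emotional", "protective"],
    ["protective", "protective"],
]
print(round(fleiss_kappa(toy), 3))
```

With exactly two coders, Cohen's kappa is the more common choice; Fleiss' kappa is still defined and generalizes to more coders.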

📊 Comparative Analysis

  • Facebook engagement: Avg. 120 likes, 45 comments, 15 shares

  • Instagram: Avg. 40 likes, 0.6 comments
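Per-platform averages like these can be computed with a simple group-and-mean step. The sketch below uses plain Python with invented numbers (they are not the study's data), and it skips missing values for each metric, which is an assumption about how gaps were handled.

```python
from statistics import mean


def platform_averages(posts, metrics=("likes", "comments", "shares")):
    """Mean of each metric per platform, ignoring missing (None) values."""
    by_platform = {}
    for p in posts:
        by_platform.setdefault(p["platform"], []).append(p)

    summary = {}
    for platform, rows in by_platform.items():
        summary[platform] = {}
        for metric in metrics:
            values = [r[metric] for r in rows if r.get(metric) is not None]
            # None signals that the platform had no usable values for the metric.
            summary[platform][metric] = mean(values) if values else None
    return summary


# Invented sample rows, for illustration only.
sample = [
    {"platform": "facebook", "likes": 100, "comments": 30, "shares": 10},
    {"platform": "facebook", "likes": 50, "comments": 10, "shares": 5},
    {"platform": "instagram", "likes": 40, "comments": 1, "shares": None},
    {"platform": "instagram", "likes": 20, "comments": 0, "shares": None},
]
print(platform_averages(sample))
```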

📈 Engagement Trends Over Time

  • Peaks during campaigns and announcements

  • Posts with infographics and videos had 40–70% higher interaction

🖼️ (Insert Screenshot: Line chart of likes/comments over time)

📢 6. Research Questions Answered

✅ Q1: How did Africa CDC use multimedia on Facebook and Instagram?

  • Used infographics and videos (Facebook), text-heavy posts on Instagram

  • No posts featured influencers

✅ Q2: How did the use of multimedia differ across platforms?

  • Facebook: More interactive, used visuals better

  • Instagram: Visually appealing but less engaging due to lack of text detail

✅ Q3: Effectiveness of multimedia?

  • Infographics boosted engagement by ~40%

  • Crisis posts on Facebook had 25% more shares
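Percentage figures like these are typically a relative difference in mean engagement between posts with and without a feature. A minimal sketch of that calculation in plain Python follows; the function name and the scores are invented, and the article does not state the exact formula used, so treat this as one plausible reading.

```python
from statistics import mean


def percent_uplift(with_feature, without_feature):
    """Relative difference in mean engagement, in percent.

    Positive values mean posts with the feature did better on average.
    """
    baseline = mean(without_feature)
    if baseline == 0:
        raise ValueError("baseline mean is zero; uplift is undefined")
    return (mean(with_feature) - baseline) / baseline * 100


# Invented engagement scores, for illustration only.
infographic_scores = [150, 150]
other_scores = [100, 100]
print(percent_uplift(infographic_scores, other_scores))
```

A difference in means alone does not show that the feature caused the increase; posting date and topic could also explain part of the gap.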

🖼️ (Insert Screenshot: Faceted plots of media type vs engagement)

💡 7. Interpretation & Insights


Key Insight: Public health agencies need platform-specific strategies: visual storytelling for Instagram and interactive, informative content for Facebook.


🚧 8. Limitations

  • Instagram share counts were not available, so combined engagement metrics are skewed toward Facebook

  • Focused only on two platforms

  • Multimedia content (images, videos) was not extracted due to scraping limits

🔮 9. Recommendations

  • Use influencers to boost engagement (predicted 200% increase in likes)

  • Add multimedia to Instagram (engagement could rise by 70%)

  • Conduct sentiment analysis in future work for deeper audience understanding

  • Include platforms like X (Twitter) and YouTube for broader insight

📚 Tools & Libraries Used

  • Python: beautifulsoup4, instaloader, pandas

  • R: tidyverse, ggplot2, dplyr, lubridate

  • Scraping: ParseHub (Facebook GUI extraction)

  • Statistical Testing: Fleiss’ Kappa, Engagement ratio calculation

  • Documentation: RMarkdown, PowerPoint for presentation

🎤 About the Author

I am a passionate Biostatistician and Data Scientist, founder of StatQuestJourney Hub, and currently presenting this research at the DataSpeaks event. My journey from raw data scraping to detailed insights reflects my mission: using data to tell stories that matter.

📨 Contact: mathewshem90@gmail.com

🔗 Follow the journey: [LinkedIn - Mathew Shem]

©2024 by StatQuestJourney Hub