Hai Liang

COMM6320: Digital Research

Text Books:

Reference Books:



1. Overview [Slides]

    Digital research = digital data + computational methods


  • Lazer, D., Pentland, A. S., Adamic, L., Aral, S., Barabasi, A. L., Brewer, D., … & Jebara, T. (2009). Life in the network: the coming age of computational social science. Science, 323(5915), 721-723.
  • Golder, S. A. & Macy, M. W. (2014). Digital footprints: Opportunities and challenges for online social research. Annual Review of Sociology, 40(1), 129.
  • Ruths, D. & Pfeffer, J. (2014). Social media for large studies of behavior. Science, 346(6213), 1063-1064.
  • Lazer, D. & Radford, J. (2017). Data ex machina: Introduction to big data. Annual Review of Sociology, 43, 19-39.
  • Kabacoff (2011) & Wickham (2010).

2. Research Design [Slides]

    Big data, experiment, survey, crowdsourcing, ethics (Salganik, 2017)


  • Salganik, Dodds, & Watts (2006). Experimental study of inequality and unpredictability in an artificial cultural market. Science, 311, 854-856. 
  • Bond, R. M., Fariss, C. J., Jones, J. J., Kramer, A. D., Marlow, C., Settle, J. E., & Fowler, J. H. (2012). A 61-million-person experiment in social influence and political mobilization. Nature, 489(7415), 295-298.
  • Muchnik, L., Aral, S., & Taylor, S. J. (2013). Social influence bias: A randomized experiment. Science, 341(6146), 647-651.
  • Kramer, A. D., Guillory, J. E., & Hancock, J. T. (2014). Experimental evidence of massive-scale emotional contagion through social networks. Proceedings of the National Academy of Sciences, 111(24), 8788-8790.
  • King, G., Pan, J., & Roberts, M. E. (2014). Reverse-engineering censorship in China: Randomized experimentation and participant observation. Science, 345(6199), 1251722.
  • Munger, K. (2016). Tweetment effects on the tweeted: Experimentally reducing racist harassment. Political Behavior, 39(3), 1-21.


  • Berinsky, A. J., Huber, G. A., & Lenz, G. S. (2012). Evaluating online labor markets for experimental research: Amazon.com’s Mechanical Turk. Political Analysis, 20(3), 351-368.
  • Wang, W., Rothschild, D., Goel, S., & Gelman, A. (2015). Forecasting elections with non-representative polls. International Journal of Forecasting, 31(3), 980-991.
  • Mellon, J. & Prosser, C. (2017). Twitter and Facebook are not representative of the general population: Political attitudes and demographics of British social media users. Research and Politics, 1-9.

3. Web Data Collection [Slides]

    APIs, screen scraping, special techniques (e.g., Selenium, apps)

  • Munzert et al. (2015)
  • Liang, H., & Fu, K. W. (2015). Testing propositions derived from Twitter studies: Generalization and replication in computational social science. PLoS ONE, 10(8), e0134270. 
  • Liang, H., & Zhu, J. J. H. (2017). Big data, collection of (social media, harvesting). In Jörg Matthes (Ed.), The International Encyclopedia of Communication Research Methods. Wiley Press.
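
The screen-scraping idea listed for this session can be illustrated with a minimal sketch. It assumes the page has already been downloaded and uses only Python's standard-library HTML parser. The sample page, the LinkCollector class, and the extract_links helper are hypothetical names made up for illustration; they are not taken from the course materials or the readings.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, anchor text) pairs from <a> tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return all (href, text) pairs found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# A made-up page standing in for a downloaded forum index.
SAMPLE_PAGE = """
<html><body>
  <h1>Forum threads</h1>
  <ul>
    <li><a href="/thread/1">Election talk</a></li>
    <li><a href="/thread/2">Local <b>news</b></a></li>
  </ul>
</body></html>
"""

links = extract_links(SAMPLE_PAGE)
for href, text in links:
    print(href, text)
```

In practice the HTML would come from an HTTP request or a browser-automation tool, and an API is usually preferable to scraping when one is available.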

4. Text Mining [Slides]

    Basic preprocessing, vector space model, supervised and unsupervised learning (Kumar, 2016; Silge & Robinson, 2017)

    Basics & Counting:

  • Grimmer, J., & Stewart, B. M. (2013). Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis, 21(3), 267-297.
  • Lansdall-Welfare, T., Sudhahar, S., Thompson, J., Lewis, J., FindMyPast Newspaper Team, & Cristianini, N. (2017). Content analysis of 150 years of British periodicals. Proceedings of the National Academy of Sciences, 114(4), E457-E465.
  • Benoit, K. et al. Getting Started with quanteda.
  • Bail, C. A. (2016). Combining natural language processing and network analysis to examine how advocacy organizations stimulate conversation on social media. Proceedings of the National Academy of Sciences, 113(42), 11823-11828.
  • Liang, H., & Fu, K. W. (2017). Information similarity, overload, and redundancy: Unsubscribing information sources on Twitter. Journal of Computer-Mediated Communication, 22(1), 1–17.
  • Liang, H., & Fu, K. W. (2016). Network redundancy and information diffusion: The impacts of information redundancy, similarity, and tie strength. Communication Research.
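
The preprocessing and vector space model steps named for this session can be sketched as follows: texts are lowercased, tokenized, stripped of stopwords, turned into term-count vectors, and compared by cosine similarity. This is a minimal sketch under stated assumptions, not the procedure used in any of the readings. The stopword list, the example posts, and all function names are hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def term_vector(text):
    """Represent a document as a sparse term-frequency vector."""
    return Counter(tokenize(text))


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_u * norm_v)


# Made-up example posts.
post_a = term_vector("The election results dominate the news")
post_b = term_vector("News of the election results")
post_c = term_vector("Cute cat pictures")

print(cosine_similarity(post_a, post_b))  # high: shared vocabulary
print(cosine_similarity(post_a, post_c))  # zero: no shared terms
```

Raw counts are the simplest weighting; tf-idf or other weights slot into the same representation without changing the similarity function.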

    Supervised Learning:

  • Beauchamp, N. (2017). Predicting and interpolating state‐level polls using Twitter textual data. American Journal of Political Science, 61(2), 490-503.
  • Theocharis, Y., Barberá, P., Fazekas, Z., Popa, S. A., & Parnet, O. (2016). A bad workman blames his tweets: The consequences of citizens’ uncivil Twitter use when interacting with party candidates. Journal of Communication, 66, 1007–1031.

    Unsupervised Learning:

  • Blei, D. M. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77-84.
  • Roberts, M. E., Stewart, B. M., Tingley, D., Lucas, C., Leder‐Luis, J., Gadarian, S. K., … & Rand, D. G. (2014). Structural Topic Models for Open‐Ended Survey Responses. American Journal of Political Science, 58(4), 1064-1082.
  • Lucas, C., Nielsen, R. A., Roberts, M. E., Stewart, B. M., Storer, A., & Tingley, D. (2015). Computer-assisted text analysis for comparative politics. Political Analysis, 23(2), 254-277.

5. Social Network Analysis [Slides]

    Basics, network formation, network influence, information diffusion (Kolaczyk & Csardi, 2014)


  • Himelboim, I. (2017). Social Network Analysis (Social Media). The International Encyclopedia of Communication Research Methods. 1–15.

    Social Selection & Influence: 

  • McPherson, M., Smith-Lovin, L., & Cook, J. M. (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27(1), 415-444.
  • Desmarais, B. A., & Cranmer, S. J. (2017). Statistical inference in political networks research. The Oxford Handbook of Political Networks, 203.
  • Wimmer, A., & Lewis, K. (2010). Beyond and below racial homophily: ERG models of a friendship network documented on Facebook. American Journal of Sociology, 116(2), 583-642.
  • Aral, S., Muchnik, L., & Sundararajan, A. (2009). Distinguishing influence-based contagion from homophily-driven diffusion in dynamic networks. Proceedings of the National Academy of Sciences, 106(51), 21544-21549.
  • Lewis, K., Gonzalez, M., & Kaufman, J. (2012). Social selection and peer influence in an online social network. Proceedings of the National Academy of Sciences, 109(1), 68-72.
  • Liang, H. (2014). The organizational principles of online political discussion: A relational event stream model for analysis of web forum deliberation. Human Communication Research, 40(4), 483-507. 
  • Liang, H. (2014). Coevolution of political discussion and common ground in web discussion forum. Social Science Computer Review, 32(2), 155-169. 
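
As a descriptive starting point for the homophily theme of these readings, the sketch below compares the share of ties that connect same-group nodes with the share expected if ties formed between node pairs at random. The toy network, the group labels, and the function names are hypothetical. A gap between the two numbers only describes the pattern; it cannot by itself separate social selection from influence, which is the problem the models in the readings address.

```python
from itertools import combinations


def same_group_tie_share(edges, group):
    """Observed share of ties whose two endpoints belong to the same group."""
    same = sum(1 for u, v in edges if group[u] == group[v])
    return same / len(edges)


def random_mixing_baseline(group):
    """Share of all possible node pairs that fall within the same group."""
    pairs = list(combinations(group, 2))
    same = sum(1 for u, v in pairs if group[u] == group[v])
    return same / len(pairs)


# Made-up undirected friendship network with two groups.
group = {"a": "x", "b": "x", "c": "x", "d": "y", "e": "y"}
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "e"), ("c", "d")]

observed = same_group_tie_share(edges, group)
expected = random_mixing_baseline(group)
print(f"observed same-group share: {observed:.2f}")
print(f"random-mixing baseline:    {expected:.2f}")
```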

    Information Diffusion:

  • Del Vicario, M., Bessi, A., Zollo, F., Petroni, F., Scala, A., Caldarelli, G., … & Quattrociocchi, W. (2016). The spreading of misinformation online. Proceedings of the National Academy of Sciences, 113(3), 554-559.
  • Ugander, J., Backstrom, L., Marlow, C., & Kleinberg, J. (2012). Structural diversity in social contagion. Proceedings of the National Academy of Sciences, 109(16), 5962-5966.
  • Vosoughi, S., Roy, D., & Aral, S. (2018). The spread of true and false news online. Science, 359(6380), 1146-1151. 
  • Lehmann, S. & Ahn, Y. Y. (2017). Spreading dynamics in Social Systems.