Monday, March 3, 2008

content analysis and intercoder reliability

Content analysis is a major subfield of communication research. The world-famous agenda-setting study, for instance, relied on content analysis as one of its two major components. Understanding what is in the media, after all, is the precursor to understanding what effects the media have.

There are a variety of ways of investigating media content. Qualitative researchers employ "textual analysis," whereas quantitative researchers use "content analysis." The two methods are similar, except that content analysis is touted for its "systematic and scientific" approach to analyzing media content.

Content analysis often involves a coding process, in which researchers pre-define a set of variables of interest and tally how frequently these variables appear in news articles. This coding process raises the question of reliability: if other researchers did the same thing with the same coding scheme, would they get similar results? This is the question of inter-coder reliability. For a detailed review of methods for calculating inter-coder reliability, please see here.
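
As a rough illustration of what coding looks like in practice, here is a minimal sketch of a coding sheet; the variables and values are made up for this example, not taken from any particular study.

```python
# A minimal sketch of a coding sheet, using hypothetical variables and values.
# Each coded story is a row; each pre-defined variable is a column.
from collections import Counter

coding_sheet = [
    {"story_id": 1, "source_type": "scientist",  "frame": "economic"},
    {"story_id": 2, "source_type": "politician", "frame": "conflict"},
    {"story_id": 3, "source_type": "scientist",  "frame": "conflict"},
]

# Tally how often each value of the "frame" variable appears across stories
frame_counts = Counter(row["frame"] for row in coding_sheet)
print(frame_counts)  # 'conflict' appears twice, 'economic' once
```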

Therefore, before researchers start to "code" stories, they have to make sure that they agree on as many cases as possible about the presence or absence of a particular variable. This is the "systematic" part of the method. The process is absolutely tedious and arduous, but it is also what makes a study credible. Obtaining an acceptable level of reliability is therefore a basic requirement for getting a study published.

There are different ways of measuring inter-coder reliability. Some people calculate the "percentage of agreement." This method has been criticized for being too lenient. Other approaches, such as Krippendorff's Alpha and Scott's Pi, build on percent agreement and correct for the effect of chance, that is, the possibility that coders produce consistent results simply by accident. This effect is especially salient when a variable has only two categories.
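
To make the difference concrete, here is a small sketch of how the two measures might be computed for two coders. The data and category labels are invented for illustration, and the functions are toy versions rather than a full-featured reliability package.

```python
# A minimal sketch of percent agreement and Scott's Pi for two coders.
# The codes below are hypothetical; categories 1, 2, 3 are arbitrary labels.
from collections import Counter

def percent_agreement(coder1, coder2):
    matches = sum(a == b for a, b in zip(coder1, coder2))
    return matches / len(coder1)

def scotts_pi(coder1, coder2):
    p_o = percent_agreement(coder1, coder2)        # observed agreement
    pooled = Counter(coder1) + Counter(coder2)     # pooled category counts
    total = sum(pooled.values())
    p_e = sum((n / total) ** 2 for n in pooled.values())  # chance agreement
    return (p_o - p_e) / (1 - p_e)

coder_a = [1, 2, 2, 3, 1, 1, 2, 3, 3, 1]
coder_b = [1, 2, 2, 3, 1, 2, 2, 3, 1, 1]
print(percent_agreement(coder_a, coder_b))  # 0.8
print(scotts_pi(coder_a, coder_b))          # roughly 0.69, corrected for chance
```

The chance correction pulls the coefficient below the raw agreement figure, which is exactly why these measures are considered more stringent.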

Which method to use depends on how stringent you want your research to be. But using Scott's Pi is not necessarily superior to using percentage of agreement, especially since people have started to criticize Scott's Pi for being too strict. My own experience with Scott's Pi is that it is really difficult to please. For instance, every value of a variable (e.g., 1, 2, 3 ...) has to appear in the coded data for Scott's Pi to be calculable. If, say, the president does not appear as a news source in any of the stories coded, and you and your coding partner reach 100% agreement on this, Scott's Pi will still tell you that the reliability coefficient is not calculable. If, unfortunately, the president appears once in the news stories and only one of the coders catches it, the reliability coefficient will be extremely low, even though the percentage of agreement may be close to 100%. This is very tricky!
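
Both situations can be reproduced with the toy scotts_pi() and percent_agreement() functions sketched above; again, the data are invented purely to show the behavior.

```python
# 100 invented stories, coded 1 if the president appears as a source, 0 otherwise.

# Case 1: the president never appears, and both coders agree on every story.
coder_a = [0] * 100
coder_b = [0] * 100
# percent_agreement(coder_a, coder_b) -> 1.0
# scotts_pi(coder_a, coder_b) -> division by zero: expected agreement is
# also 1.0, so the coefficient is undefined ("not calculable").

# Case 2: the president appears once, and only one coder catches it.
coder_a = [1] + [0] * 99
coder_b = [0] * 100
# percent_agreement(coder_a, coder_b) -> 0.99
# scotts_pi(coder_a, coder_b) -> about -0.005, despite 99% raw agreement.
```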

Most journals require authors to report reliability for each variable coded. But some journals require only an average coefficient, which saves a lot of trouble because a few bad variables can be averaged out by the good ones.
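
A quick made-up illustration of how averaging can hide a weak variable:

```python
# Hypothetical per-variable coefficients: one weak variable (0.45)
# disappears into an average that looks respectable.
per_variable_pi = [0.85, 0.90, 0.88, 0.45]
print(sum(per_variable_pi) / len(per_variable_pi))  # 0.77
```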

Given how time-consuming and tedious the process of content analysis is, communication scholars have offered their sincerest suggestion to their successors--just don't do it!

For examples of quantitative content analysis, please see McComas and Shanahan (1999), "Telling Stories About Global Climate Change," and Nisbet et al. (2003), "Framing Science."
