# What Is a Good Percent Agreement?

*[Table: percent agreement among multiple data collectors (dummy data).]*

Here, reporting the percent agreement and attributing the disagreements is informative, while kappa obscures this information. Kappa also introduces challenges of calculation and interpretation, because kappa is a ratio. The ratio can return an undefined value when the denominator is zero, and a ratio reveals neither its numerator nor its denominator. It is more informative for researchers to report disagreement in two components, quantity and allocation. These two components describe the relationship between the categories more clearly than a single summary statistic does. If the goal is predictive accuracy, researchers can more easily think about how to improve a prediction by using the two components of quantity and allocation, rather than one ratio of kappa.[2]

*[Figure: amount of correct data plotted against percent agreement and kappa; image not reproduced.]*
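As a minimal sketch of the two-component decomposition described above (the 2×2 confusion matrix below is invented dummy data, and the function name is my own): quantity disagreement reflects differences between the two raters' marginal category proportions, and allocation disagreement is the rest of the total disagreement.

```python
import numpy as np

def quantity_allocation(confusion):
    """Split total disagreement into quantity and allocation components.

    confusion[i][j] = number of items rater A placed in category i
    and rater B placed in category j.
    """
    m = np.asarray(confusion, dtype=float)
    n = m.sum()
    p = m / n                               # joint proportions
    total = 1.0 - np.trace(p)               # total disagreement = 1 - percent agreement
    # quantity: half the summed absolute differences between the raters' marginals
    quantity = 0.5 * np.abs(p.sum(axis=1) - p.sum(axis=0)).sum()
    allocation = total - quantity           # whatever disagreement is not quantity
    return quantity, allocation

# Dummy 2x2 example: rows = rater A, columns = rater B
q, a = quantity_allocation([[10, 5], [3, 12]])
print(round(q, 4), round(a, 4))   # → 0.0667 0.2
```

Reporting the pair (quantity, allocation) instead of one ratio makes it visible whether the raters disagree about how many items belong in each category or merely about which items those are.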

If statistical significance is not a useful guide, what magnitude of kappa reflects adequate agreement? Guidelines would be helpful, but factors other than agreement can influence kappa's magnitude, which makes interpretation of a given value problematic. As Sim and Wright noted, two important factors are prevalence (are the codes equiprobable, or do their probabilities vary?) and bias (are the marginal probabilities similar or different for the two observers?). Other things being equal, kappas are higher when the codes are equiprobable. On the other hand, kappas are higher when the codes are distributed asymmetrically by the two observers. In contrast to variations in prevalence, the distorting effect of bias is greater when kappa is small than when it is large.[11]: 261–262

Several formulas can be used to calculate confidence limits for kappa. One simple approach, which works well for sample sizes greater than 60,[14] uses kappa's approximate large-sample standard error. Note also that if the number of categories used is small (e.g., 2 or 3), the probability that two raters agree by pure chance increases considerably.
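To make the magnitude discussion concrete, here is a hedged sketch (invented example counts) that computes Cohen's kappa from a confusion matrix and an approximate 95% confidence interval. The standard-error formula used, SE ≈ √(p_o(1−p_o) / (N(1−p_e)²)), is one commonly quoted large-sample approximation; it is my assumption, not necessarily the formula reference [14] has in mind.

```python
import math

def cohens_kappa(confusion):
    """Cohen's kappa with an approximate 95% confidence interval."""
    n = sum(sum(row) for row in confusion)
    k = len(confusion)
    po = sum(confusion[i][i] for i in range(k)) / n          # observed agreement
    row = [sum(confusion[i]) / n for i in range(k)]          # rater A marginals
    col = [sum(confusion[i][j] for i in range(k)) / n for j in range(k)]  # rater B marginals
    pe = sum(row[i] * col[i] for i in range(k))              # chance agreement
    kappa = (po - pe) / (1 - pe)
    # One common large-sample approximation to kappa's standard error (assumed here)
    se = math.sqrt(po * (1 - po) / (n * (1 - pe) ** 2))
    return kappa, (kappa - 1.96 * se, kappa + 1.96 * se)

kappa, ci = cohens_kappa([[20, 5], [10, 15]])
print(round(kappa, 3))   # → 0.4
```

With these dummy counts, observed agreement is 0.7 and chance agreement 0.5, giving κ = 0.4; the interval conveys how imprecise that estimate is at N = 50.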

Indeed, both raters are constrained to the limited number of options available, which affects the overall rate of agreement but not necessarily their propensity for “intrinsic” agreement (agreement is considered “intrinsic” if it is not due to chance). Weighted kappa allows disagreements to be weighted differently[21] and is especially useful when the codes are ordered.[8]: 66 Three matrices are involved: the matrix of observed scores, the matrix of expected scores based on chance agreement, and the weight matrix. The cells of the weight matrix on the diagonal (upper left to lower right) represent agreement and therefore contain zeros. Off-diagonal cells contain weights indicating the seriousness of that disagreement.
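The three-matrix formulation above can be sketched as follows (the ordinal counts and linear weights are invented for illustration): with a weight matrix that is zero on the diagonal and grows with the seriousness of disagreement, weighted kappa is one minus the ratio of weighted observed disagreement to weighted chance-expected disagreement.

```python
import numpy as np

def weighted_kappa(confusion, weights):
    """Weighted kappa from the observed matrix, the chance-expected matrix,
    and a weight matrix with zeros on the diagonal (agreement) and larger
    values for more serious disagreements."""
    o = np.asarray(confusion, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = o.sum()
    observed = o / n                                            # observed proportions
    expected = np.outer(o.sum(axis=1), o.sum(axis=0)) / n ** 2  # chance-expected proportions
    return 1.0 - (w * observed).sum() / (w * expected).sum()

# Three ordered categories with linear disagreement weights |i - j|
counts = [[5, 1, 0], [1, 4, 1], [0, 1, 5]]
w = [[abs(i - j) for j in range(3)] for i in range(3)]
kw = weighted_kappa(counts, w)
print(kw)
```

Because the off-diagonal weights here grow with distance between the ordered categories, a disagreement of two categories counts twice as heavily as a disagreement of one, which is exactly why weighted kappa suits ordinal codes.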