MPC 06 Describe point-biserial correlation and phi-coefficient. (Marks 6)

 

Q. Describe point-biserial correlation and phi-coefficient. (6 marks)

Point-biserial correlation

Some variables in research are dichotomous. A dichotomous variable is one that can take only one of two sharply distinguished, mutually exclusive categories. Examples include male-female, rural-urban, Indian-American, diagnosed with an illness versus not diagnosed, and experimental group versus control group. These are truly dichotomous variables, for which no underlying continuous distribution can be assumed. They represent categories rather than measurements.

If we want to correlate such a variable with a continuous variable (e.g., income, test scores, age), the usual reading of Pearson's correlation coefficient becomes problematic, because the dichotomous variable lacks continuity. Pearson's r is normally interpreted on the assumption that both variables are continuous and normally distributed, which is not the case here.

To address this, we use the point-biserial correlation coefficient (rpb). It is a special case of the Pearson correlation, used when one variable is truly dichotomous and the other is continuous. Mathematically, the formula for rpb is equivalent to Pearson's r; conceptually, it acknowledges the categorical nature of one variable.

The dichotomous variable is typically coded as 0 and 1, though any two distinct values can be used (e.g., 0 and 1, or 5 and 11). The magnitude of the correlation remains the same; only its sign depends on which category receives the higher code.

In short, the point-biserial correlation (rpb) is Pearson's product-moment correlation between one truly dichotomous variable and one continuous variable. Algebraically, rpb = r, so rpb can be calculated in the same way as Pearson's r.
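The equivalence rpb = r can be checked numerically. The sketch below is only an illustration: the scores and group codes are made-up data, and the function names pearson_r and point_biserial are hypothetical helpers, not part of any library. It compares Pearson's r with the commonly quoted mean-difference form of the point-biserial coefficient, rpb = ((M1 - M0) / s) * sqrt(p * q), where M1 and M0 are the group means, s is the standard deviation of all scores (dividing by n), and p and q are the proportions of cases coded 1 and 0.

```python
import math

def pearson_r(x, y):
    """Pearson's product-moment correlation, computed from deviation sums."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def point_biserial(group, score):
    """Mean-difference form of rpb; `group` must be coded 0/1."""
    n = len(score)
    ones = [s for g, s in zip(group, score) if g == 1]
    zeros = [s for g, s in zip(group, score) if g == 0]
    p = len(ones) / n          # proportion coded 1
    q = 1 - p                  # proportion coded 0
    m1 = sum(ones) / len(ones)
    m0 = sum(zeros) / len(zeros)
    mean = sum(score) / n
    # Standard deviation of all scores, dividing by n (not n - 1)
    sd = math.sqrt(sum((s - mean) ** 2 for s in score) / n)
    return (m1 - m0) / sd * math.sqrt(p * q)

# Hypothetical data: 0 = control group, 1 = experimental group
group = [0, 0, 0, 0, 1, 1, 1, 1]
score = [10, 12, 11, 13, 15, 17, 16, 18]

r = pearson_r(group, score)
rpb = point_biserial(group, score)

# Recoding the two categories as 5 and 11 leaves the coefficient unchanged
recoded = [5 if g == 0 else 11 for g in group]
r_recoded = pearson_r(recoded, score)

# Swapping which category gets the higher code only flips the sign
flipped = [1 - g for g in group]
r_flipped = pearson_r(flipped, score)

print(r, rpb, r_recoded, r_flipped)
```

The first three printed values should agree up to floating-point rounding, and the last should be the same number with the opposite sign.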

Phi-coefficient

As described above, the Pearson correlation between one dichotomous variable and one continuous variable is called the point-biserial correlation. When both variables are dichotomous, the Pearson correlation is called the phi-coefficient (ϕ).

Use the Phi-coefficient when both variables are binary (e.g., comparing gender and exam result, or smoking status and disease status).

If we organize the data in a 2×2 table:

                  Variable B = 1    Variable B = 0    Row Total
Variable A = 1    a                 b                 a + b
Variable A = 0    c                 d                 c + d
Column Total      a + c             b + d             N (Total)

Then the Phi-coefficient (ϕ) is calculated as:

ϕ = (ad − bc) / √[(a + b)(c + d)(a + c)(b + d)]
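As a rough illustration, the sketch below computes ϕ from the four cell counts a, b, c, d of the 2×2 table, using the standard cross-product formula, and checks that it matches Pearson's r on the same data coded as 0 and 1. The cell counts are invented for the example, and phi_from_table, expand_table and pearson_r are hypothetical helper names, not library functions.

```python
import math

def phi_from_table(a, b, c, d):
    """Phi from 2x2 cell counts: a = (A=1, B=1), b = (A=1, B=0),
    c = (A=0, B=1), d = (A=0, B=0)."""
    numerator = a * d - b * c
    # Product of the two row totals and the two column totals
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

def expand_table(a, b, c, d):
    """Turn the cell counts back into two 0/1-coded lists of cases."""
    var_a = [1] * a + [1] * b + [0] * c + [0] * d
    var_b = [1] * a + [0] * b + [1] * c + [0] * d
    return var_a, var_b

def pearson_r(x, y):
    """Pearson's product-moment correlation, computed from deviation sums."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    sxx = sum((u - mean_x) ** 2 for u in x)
    syy = sum((v - mean_y) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical counts, e.g. A = smoking status, B = disease status
a, b, c, d = 30, 10, 15, 45

phi = phi_from_table(a, b, c, d)
var_a, var_b = expand_table(a, b, c, d)
r = pearson_r(var_a, var_b)

print(phi, r)
```

The two printed values should agree up to floating-point rounding, which is the sense in which ϕ is Pearson's correlation applied to two dichotomous variables.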


 
