# Misleading stats used frequently in reporting Covid-related figures globally: Local study



**Notes both Government and non-government bodies have either knowingly or unwittingly misrepresented data**

**BY Ruwan Laknath Jayakody**

Plenty of misleading statistics have been released and used, intentionally or otherwise, among the numerical and graphical components (count data, graphs and charts, indices, and proportions) published to date by various governing and non-governing bodies, the latter including media institutions and the general public, a paper titled “A review on misleading Covid-19 statistics” found. The article, by D.Y. Jayasinghe, P. Dias and C.L. Jayasinghe (all three attached to the Sri Jayewardenepura University’s Statistics Department), was published in December 2020 in the *Sri Lanka Journal of Applied Statistics (21) 3.*

Statistics, Jayasinghe et al. explain, constitute a platform for data to speak out and provide information for data-based decision-making. However, misleading statistics, whether used purposely or without knowledge, can, Jayasinghe et al. point out, lead to miscommunication and ultimately to poor or incorrect decisions, as researchers arrive at conclusions based on the summary measures they obtain and the inferences they draw from them. Of all the situations where statistics have misled viewers and readers, the Covid-19 outbreak, Jayasinghe et al. mention, is the latest instance in which numerous statistical misrepresentations have been utilised and concepts misinterpreted by various governing and non-governing bodies. According to G. Pennycook, J. McPhetres, Y. Zhang, J.G. Lu, and D.G. Rand’s “Fighting Covid-19 misinformation on social media: Experimental evidence for a scalable accuracy nudge intervention”, regular prompts about the idea of accuracy, delivered through various mass media platforms, might be sufficient to improve people’s decisions about sharing Covid-19 information and to reduce the volume of misinformation regarding the virus.

**Misleading Covid-19 statistics**

**Count data**

Totals of variables of interest, such as the number of deaths, the number of active cases, and the total number of confirmed cases, constitute, per Jayasinghe et al., one of the most primitive measures utilised to compare the severity of Covid-19. They add, however, that counts can be useful to show when the incidence is starting to recede as public health measures take effect in a particular population. As per the findings of N. Pearce, J.P. Vandenbroucke, T.J. VanderWeele, and S. Greenland in “Accurate statistics on Covid-19 are essential for policy guidance and decisions”, the shape of the trends in case counts made it possible to see at the time (2020) that the UK, France, Italy, and Spain were on similar trajectories, whereas the Republic of Korea (South Korea) and other Asian countries were “flattening the curve”.

“However, the use of these types of count data should be directly exempted when making comparisons due to several reasons. A comparison would be fair only when the compared individuals are kept under similar background conditions. The variable utilised in the comparison should be the only varied factor in order to make the comparison fair. It is not fair to compare apples and oranges or dollars and coconuts. Similarly, the same theory applies here. If we need to perform a country-wise comparison, it is incorrect to use only count data, since different countries have different land areas, populations with different age distributions, various lifestyles, different dietary habits, and different compositions with respect to gender, ethnicity, etc.,” Jayasinghe et al. elaborate. “Simply, if we compare the count of confirmed cases in the US with Sri Lanka, it is obvious that the US reports a higher number with respect to the geographical region of the country’s spread and thereby the total population in the country. Hence, a more valid comparison can be conducted by considering not the total number, but the total deaths per million. This eliminates the effect of the total size of the populations of the countries being compared.”

For example, Jayasinghe et al. cite the figures as updated on the morning of 5 May 2020, sourced from Our World in Data, which reveal how the country rankings change when death counts and deaths per million are considered. The US moves from being ranked number one in the total number of deaths to number nine when adjusted per capita, with the caveat that the deaths in the US do not yet seem to have stabilised; Belgium makes the reverse move, going from number six in overall deaths to being the country with the most deaths per million population; while the UK, following the inclusion of care home deaths, now has the third highest number of deaths in the world (per H. Krelle, C. Barclay and C. Tallack’s “Understanding excess mortality”, The Health Foundation). Therefore, Jayasinghe et al. reiterate, using count data to compare the severity levels of the outbreak in two different geographical regions is clearly misleading.

Different tests are performed in order to detect Covid-19 cases; in Sri Lanka, polymerase chain reaction (PCR) tests are primarily conducted, while antigen tests too are utilised. Distinct diagnostic tests, Jayasinghe et al. note, possess different testing accuracies and hence may reveal dissimilar results. Comparisons related to count data (e.g. the total number of cases detected) are therefore more valid when detection has been done using the same test. Further, Jayasinghe et al. observe that “random testing” and “testing contacts” are two different testing strategies. “There is a high chance of getting a positive test result for close contacts of a Covid-19 case; hence, the positive rate is higher when the ‘testing contacts’ strategy is adopted. Conversely, the positive rate may be low for a randomly chosen sample, especially if the community transmission stage has not been reached”. Therefore, count data, Jayasinghe et al. conclude, could be misleading, especially when conducting comparisons.
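The per-million adjustment described above can be illustrated with a minimal Python sketch. The two countries and all of their figures are hypothetical, chosen only to show how a ranking by raw death counts can reverse once population size is taken into account; the function names are likewise illustrative and do not come from the paper.

```python
# Hypothetical figures for two made-up countries; not real Covid-19 data.
countries = {
    "Country A": {"deaths": 50_000, "population": 300_000_000},
    "Country B": {"deaths": 8_000, "population": 10_000_000},
}


def deaths_per_million(deaths: int, population: int) -> float:
    """Scale a raw death count to deaths per one million inhabitants."""
    return deaths * 1_000_000 / population


def rank_by(metric):
    """Return country names ordered from the highest to the lowest metric value."""
    return sorted(countries, key=lambda name: metric(countries[name]), reverse=True)


# Ranking on raw counts versus ranking on the population-adjusted rate.
by_count = rank_by(lambda c: c["deaths"])
by_rate = rank_by(lambda c: deaths_per_million(c["deaths"], c["population"]))

print(by_count)  # ['Country A', 'Country B']
print(by_rate)   # ['Country B', 'Country A']
```

In this made-up case, Country A leads on raw deaths, but Country B has 800 deaths per million against roughly 167 for Country A, so the order flips. The sketch adjusts for population size only; it does not address the other differences the authors list, such as age distribution or testing strategy.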

**Charts**

To illustrate misleading charts, Jayasinghe et al. take a figure extracted from the Covid-19 Dashboard presented by the Health Promotion Bureau (HPB) on 24 December 2020. “Although the computed statistics are acceptable, having these graphs presented in the same place, without scaling the two graphs as a whole, seems misleading. For example, in the case of France, it has a fatality rate of 2.47% while the recovery rate is 7.47%. If we compare these rates of France itself, France has a recovery rate which is higher than the fatality rate. When these two graphs are presented simultaneously, the length of the bars tend to mislead the viewer by representing 2.47% by a longer bar than the bar representing the value 7.47%, on the chart. The aim of representing data via graphs should be allowing the viewer to grab the facts instantly without a thorough observance. This norm is violated in this instance, where it has to be counted in for the list of misleading Covid-19 statistics,” Jayasinghe et al. add.

Furthermore, Jayasinghe et al. state that the fatality rate and the recovery rate have been computed using the following equations: the fatality rate equals the total number of deaths divided by the total number of cases reported, multiplied by 100%, while the recovery rate equals the total number of recovered cases divided by the total number of cases reported, multiplied by 100%. Because cases still under treatment are counted in the denominator but in neither numerator, the two rates do not sum to 100%. “The rates should be calculated from the resolved cases (i.e. excluding the cases which are still under treatment). Here, per the Health Ministry (2020), the total number of recovered cases was defined to be those who received negative results for two consecutive PCR tests.”

Jayasinghe et al. present another figure, a bar chart containing a misleading data representation, which is published on the HPB website. 
“As per the partial image, it reveals that the number of PCR tests conducted on 19 February 2020 is null, however, there is some height indicated in the chart itself. If it was correctly represented, then there should not be a vertical bar,” Jayasinghe et al. observe.
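The difference between the two ways of computing the fatality and recovery rates can be sketched in Python as follows. The case numbers are hypothetical and the function names are illustrative; only the two formulas (dividing by all reported cases versus dividing by resolved cases) are taken from the description above.

```python
def rates_over_all_cases(deaths: int, recovered: int, total_cases: int):
    """Fatality and recovery rates (%) with all reported cases as the denominator."""
    fatality = 100 * deaths / total_cases
    recovery = 100 * recovered / total_cases
    return fatality, recovery


def rates_over_resolved_cases(deaths: int, recovered: int):
    """Fatality and recovery rates (%) with only resolved cases as the denominator.

    Resolved cases are deaths plus recoveries, i.e. cases still under
    treatment are excluded.
    """
    resolved = deaths + recovered
    fatality = 100 * deaths / resolved
    recovery = 100 * recovered / resolved
    return fatality, recovery


# Hypothetical outbreak: 1,000 reported cases, of which 500 are still under treatment.
deaths, recovered, total_cases = 20, 480, 1_000

all_fatality, all_recovery = rates_over_all_cases(deaths, recovered, total_cases)
res_fatality, res_recovery = rates_over_resolved_cases(deaths, recovered)

print(all_fatality, all_recovery, all_fatality + all_recovery)  # 2.0 48.0 50.0
print(res_fatality, res_recovery, res_fatality + res_recovery)  # 4.0 96.0 100.0
```

With all reported cases in the denominator, the two rates sum to only 50% here because half of the cases are unresolved. Restricting the denominator to resolved cases makes the two rates complementary.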

**Indices**

- Koetsier’s “The 100 safest countries for Covid-19: Updated” (2020) introduced a safety index to rank the 100 safest countries during the pandemic. It allocates each country a score on six factors: quarantine efficiency, Government efficiency of risk management, monitoring and detection, health care readiness, regional resiliency, and emergency preparedness. These factors were obtained from several other variables, and both those variables and the weighting schemes used are shown in separate figures.

**Proportions**

Jayasinghe et al. explain thus: “The number of PCR tests conducted should be taken into account when comparing active cases on a daily basis. None of the countries perform a similar number of PCR tests per day including within the individual country itself. For example, if on the first day, a country performed 100 tests to figure out new cases and found five active cases and the same country, on the next day, performed 1,000 PCR tests and found only 10 positive cases, just because 10 is a larger value than five, it is unfair to conclude that the number of positive cases of that country has doubled its value.

“To address this issue, proportions were taken into account. The following equation – the proportion of positive cases for a day equals the number of positive cases for the day divided by the number of PCR tests performed on that day – is the formula to compute the proportion of active cases. Let us take the statistics for a hypothetical example where, in day one, there were five positive cases found, 100 PCR tests performed, with the proportion of positive cases being 0.05 while on day two, 10 positive cases were found, 1,000 PCR tests were conducted, with the proportion of positive cases being 0.01.

“With respect to the positive cases, day two is worse than day one due to the increased number of detected cases, but when the proportion is considered, the converse is correct, where day two is better than day one due to decrements in the proportion of positive cases. Therefore, proportion provides a better representation in such a scenario. However, when comparing the world data, it is important to see how countries give definitions to the terms. According to M. Roser, H. Ritchie, E. Ortiz-Ospina, and J. Hasell’s “Coronavirus pandemic (Covid-19)”, the informed number of tests conducted by a particular country does not refer to the same in each country – one difference is that some countries report the number of people tested, while others report the number of tests performed which can be higher if the same person is tested more than once.

“PCR tests are performed not only to detect new cases, but also to detect cured cases after treatment. In such an instance, it is definitely misleading to compare countries making the proportion of positive cases as the bottom line. Therefore, if a comparison is needed, it is advisable to check the definitions of the variables given by the countries in order to avoid misleading representations.”

In conclusion, the following aspects were highlighted:

- The use of count data for comparisons under dissimilar background conditions is misleading.
- Data representation via graphs should not violate the norms of presenting summarised data graphically, and should communicate the real scenario.
- The computations of fatality rates and recovery rates should be improved in order to showcase the actual figures by excluding uncertain cases, such as patients still under treatment who could recover or die.
- Using a common index or scoring system to compare all the countries, which possess dissimilar geographic and political factors, is highly misleading.
- When comparing proportions of daily positive cases, it is imperative to consider only the results of daily PCR tests conducted for the detection of new cases; if the intention of conducting the test is disregarded, whether it is to diagnose the disease or detect recovery, it could significantly impact the actual figures and hence the decisions made.
- Comparisons of daily positive cases should be made with counts generated through the same diagnostic test.
- Since test results are not reported on the same day, the number of positive cases and the number of tests performed cannot be matched.

Therefore, it is important to be mindful of the definitions of these measures, and consequently their computations, in order to prevent the generation of misleading facts and figures and thus inaccurate decisions.
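The hypothetical two-day example from the section on proportions can be reproduced with a short Python sketch. The figures (five positives from 100 tests on day one, 10 positives from 1,000 tests on day two) are those quoted from Jayasinghe et al.; the function and variable names are illustrative only.

```python
def positive_proportion(positive_cases: int, tests_performed: int) -> float:
    """Proportion of positive cases among the PCR tests performed on one day."""
    if tests_performed <= 0:
        raise ValueError("tests_performed must be a positive number")
    return positive_cases / tests_performed


# (positive cases, PCR tests performed) for each day of the hypothetical example.
days = {"day one": (5, 100), "day two": (10, 1_000)}

proportions = {day: positive_proportion(*counts) for day, counts in days.items()}

for day, (cases, tests) in days.items():
    print(f"{day}: {cases} positive out of {tests} tests -> {proportions[day]:.2f}")
# day one: 5 positive out of 100 tests -> 0.05
# day two: 10 positive out of 1000 tests -> 0.01
```

The raw count doubles from day one to day two, while the proportion falls from 0.05 to 0.01. As the authors caution, the proportion is only comparable when the tests counted in the denominator were carried out for the same purpose and are defined in the same way.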