GREEN SILENCE: DOUBLE MACHINE LEARNING CARBON EMISSIONS UNDER SAMPLE SELECTION BIAS
C. CHEN*, A. LIOUI** and O. SCAILLET***
* Adam Smith Business School, University of Glasgow ** EDHEC Business School *** Université de Genève and Swiss Finance Institute
Abstract
Voluntary carbon disclosure collapses into a paradox of green silence: firms choose whether to disclose emissions based on strategic incentives, while high emitters may rationally exploit estimation bias by remaining silent. Mirroring the classic sample selection problem of Heckman (1979), this self-censorship turns observed emissions into a non-random sample, distorting both climate risk pricing and climate policy analysis. We propose a Heckman-inspired three-step framework tailored to high-dimensional settings that corrects for strategic non-disclosure while ensuring variable selection consistency in the presence of sample selection bias. The framework integrates kernel group lasso with double machine learning and exploits information from neighbouring firms. Our empirical analysis of 3,444 U.S. firms over the period 2010–2023 strongly rejects the null hypothesis of no selection bias. Naïve imputation methods based on firm size or revenue substantially underestimate corporate carbon emissions relative to our selection-corrected high-dimensional approach. This underestimation translates into a $2.6 billion shortfall in carbon tax revenues and up to $525 billion in hidden social costs of carbon. Correcting for selection bias also implies a substantially larger carbon premium, highlighting the carbon pricing consequences of endogenous disclosure.
Keywords: carbon emissions, machine learning, sample selection.
JEL Classification: C12, C13, C33, C51, C52, C82, G11, G18, Q52, Q54, Q56, Q58.
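To make the selection mechanism in the abstract concrete, the sketch below simulates "green silence" and applies the classic low-dimensional two-step correction of Heckman (1979). It is not the three-step kernel group lasso and double machine learning procedure proposed in the paper. The data-generating process, the coefficient values, the variable names (`x` as a firm characteristic, `w` as a disclosure driver excluded from the emissions equation) and the helper functions are all assumptions chosen for illustration only.

```python
import math
import numpy as np

# Standard normal pdf and cdf using only numpy and the standard library.
_erf = np.vectorize(math.erf)

def norm_pdf(t):
    return np.exp(-0.5 * t ** 2) / math.sqrt(2.0 * math.pi)

def norm_cdf(t):
    return 0.5 * (1.0 + _erf(t / math.sqrt(2.0)))

def fit_probit(Z, d, n_iter=25):
    """Probit maximum likelihood via Fisher scoring (illustrative helper)."""
    gamma = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        idx = Z @ gamma
        Phi = np.clip(norm_cdf(idx), 1e-10, 1.0 - 1e-10)
        phi = norm_pdf(idx)
        grad = Z.T @ (phi * (d - Phi) / (Phi * (1.0 - Phi)))
        info = (Z * (phi ** 2 / (Phi * (1.0 - Phi)))[:, None]).T @ Z
        gamma = gamma + np.linalg.solve(info, grad)
    return gamma

# --- Hypothetical data-generating process (all values are assumptions) ---
rng = np.random.default_rng(0)
n = 20_000
rho = -0.7                      # high unobserved emissions -> less disclosure
x = rng.standard_normal(n)      # firm characteristic, e.g. log size
w = rng.standard_normal(n)      # disclosure driver excluded from outcome
u = rng.standard_normal(n)      # selection error
e = rho * u + math.sqrt(1.0 - rho ** 2) * rng.standard_normal(n)

y = 1.0 + 2.0 * x + e                         # latent log emissions
d = (0.3 + 0.5 * x + 1.0 * w + u > 0) * 1.0   # 1 if the firm discloses

# --- Naive OLS on disclosing firms only ---
sel = d == 1.0
X_sel = np.column_stack([np.ones(sel.sum()), x[sel]])
naive_beta, *_ = np.linalg.lstsq(X_sel, y[sel], rcond=None)

# --- Heckman two-step: probit, then OLS with the inverse Mills ratio ---
Z = np.column_stack([np.ones(n), x, w])
gamma_hat = fit_probit(Z, d)
idx_sel = Z[sel] @ gamma_hat
mills = norm_pdf(idx_sel) / np.clip(norm_cdf(idx_sel), 1e-10, None)
X_heck = np.column_stack([X_sel, mills])
heckman_beta, *_ = np.linalg.lstsq(X_heck, y[sel], rcond=None)

print("true intercept and slope: 1.0, 2.0")
print("naive OLS:               ", naive_beta)
print("Heckman two-step:        ", heckman_beta[:2])
print("coefficient on Mills ratio:", heckman_beta[2])
```

In this simulated setting the coefficient on the inverse Mills ratio estimates the covariance between the disclosure and emissions errors, so a negative value corresponds to high emitters staying silent, and testing whether it is zero is the textbook analogue of a test for no selection bias. The variable `w` plays the role of an exclusion restriction; in a low-dimensional two-step procedure the correction is only weakly identified without one.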