Background removal from global auroral images: Data driven dayglow modelling

We present a data driven dayglow model for global auroral images. l The method uses robust statistics and automatically deselects auroral pixels. l The method allows for slow temporal variations


Introduction
Global auroral cameras are key tools in revealing the complex interplay between the magnetized solar wind and Earth.While insitu instruments on satellites provide high-precision measurements from a localized region, remote sensing provides largescale measurements of the state and evolution of the coupled magnetosphere−ionosphere system.
The first images of the aurora taken from space were obtained by the Isis-2 scanning photometer, which sampled large fractions of the polar region in visible wavelengths during each orbit (Anger et al., 1973;Lui et al., 1975).Before this, the full auroral oval could only be examined statistically by combining numerous groundbased observations from limited regions (e.g.Vestine, 1944;Feldstein, 1964).With KYOKKO came the first camera in the Far-Ultraviolet (FUV) range, enabling measurements also in the sunlit hemisphere (Kaneda et al., 1977;Hirao and Itoh, 1978).The first truly global images were obtained by the Dynamics Explorer 1 (DE-1) mission (Frank et al., 1981).It had the entire oval within the fieldof-view for up to 5 hours, which provided direct observations of how the polar cap expands and contracts as the magnetospheric system responds to external solar wind forcing (Frank and Craven, 1988).Since then, Viking (Anger et al., 1987), Akebono (Oguti et al., 1990), Polar (Acuña et al., 1995;Frank et al., 1995;Torr et al., 1995), Interball-2 (Zelenyi et al., 1997), and IMAGE (Burch, 2000;Mende et al., 2000) have provided large-scale images of the aurora with temporal resolutions on the order of seconds or minutes from high Earth orbits.In addition to these high altitude imagers, several satellites in low Earth orbits have been able to scan the aurora in different wavelengths, providing one image per orbit with high spatial resolution.These imagers have greatly improved our knowledge of the auroral morphology and dynamics, and have revealed distinct auroral features in the main oval (e.g.Cogger et al., 1977), equatorward of the oval (e.g Anger et al., 1978;Elphinstone et al., 1993;Burch et al., 2002;Immel et al., 2002) and in the polar cap (e.g Frank et al., 1981Frank et al., , 1986;;Murphree et al., 1990;Frey et al., 2003a;Zhang QH et al., 2021).Due to the strong coupling to the magnetosphere, auroral observations have been used to examine meso-scale structures in the magnetotail (Sergeev et al., 2000;Zesta et al., 2000) and asymmetries in the magnetospheric system (Østgaard et al., 2003, 2011, 2018;Ohma et al., 2018;Laundal and Østgaard, 2009).Auroral imagers also provide quantitative measurements of the characteristic energy and energy flux of precipitating particles, and from these the ionospheric conductance (Kamide et al., 1986;Lummerzheim et al., 1991;Germany et al., 1994a;Germany et al., 1994b;Frey et al., 2003b).While incoherent scatter radars (in particular) or particle measurements from spacecraft can provide these quantities with higher certainty, the strength of global cameras lies in their ability to provide instantaneous estimates over a large region with good spatial and temporal resolution (Brekke et al., 1993).
Large-scale FUV images of the aurora are thus a key tool in space physics.There are, however, two main sources of the ionospheric FUV emissions observed from space: precipitating particles (which produce the aurora) and photoelectrons produced by sunlight.The sunlight-induced dayglow is most intense at the subsolar point, and decreases as the solar zenith angle increases.The intensity of the dayglow also depends on the flux of the extreme ultraviolet (EUV) solar radiation and on the atmospheric composition, which affects both production and absorption of dayglow (Meier, 1991).The solar radio flux at 10.7 cm (F 10.7 ) is usually used as a proxy of the EUV flux.In addition to dayglow, nightglow emissions and scattered sunlight from the atmosphere contribute to the background signal in specific wavelengths.The angle at which the emissions are observed, , also affects the measured intensities; if emission lines that are optically thin (not absorbed by the atmosphere) are observed at an angle, the effective volume of the emission region is larger compared to observations from nadir, causing a stronger observed signal.
It is often necessary to remove all non-auroral emissions from the images when analyzing the auroral component.Many auroral phenomena appear in the sunlit hemisphere and fair comparison of the extent and intensity of auroral features between different locations or different times can only be done if the full background is removed.The same applies to statistical considerations and quantitative estimations based on FUV images, which warrant the need for robust background modelling.The dayglow intensity can be modelled from first principles (Strickland et al., 1999) or empirically via observations.Liou et al. (1997) binned Polar data from the Ultraviolet Imager (UVI) below 60° magnetic latitude on and fitted a cosine function to make a smooth background model.Wang LM et al. (2018) fitted to individual images while ignoring emissions at typical auroral latitudes, and then used hourly averages of and to construct a model that depends on F 10.7 , universal time and day of year.Nicholas et al. (1997) made an empirical background model for DE-1 by binning 178 images obtained during magnetically quiet periods by both and and then fitting a surface to the binned data.A similar approach has been applied to the Global Ultraviolet Imager (Paxton et al., 1999;Christensen et al., 2003) and the Special Sensor Ultraviolet Spectrographic Imager (Paxton et al., 1992) in low Earth orbit (Zhang Y and Paxton, 2008;Zhang YL et al., 2022): A reference model is constructed using quiet time passes while considering both solar zenith angle and look angle of each pixel.This reference model is then scaled to individual passes by minimizing the misfit to the observed emissions, thus accounting for variations in dayglow intensity caused by e.g.variations in solar radiance.While Earth's dayglow is mainly aligned with the solar zenith angle, it is also affected by atmospheric conditions.During geomagnetic active periods, heating of the upper atmosphere leads to upwelling of N and O , and a subsequent depletion of atomic O.This depletion is not symmetric around noon, but is most pronounced at dawn (Craven et al., 1994).This causes dawndusk asymmetries in the dayglow.For this particular reason, Immel et al. (2000) included two extra angular dependencies in their empirical model of DE-1 images in addition to and : A phase angle orthogonal to the solar zenith angle and an azimuthal angle relative to the subsolar longitude.This allowed their model to also capture asymmetries in the background emissions.
While the aforementioned methods make models from a collection of images, background emissions can also be modelled for each image individually.Lummerzheim et al. (1997) used only nadir looking images and binned each image by the solar zenith angle.They then calculated the mean intensity in each bin while ignoring emissions from auroral latitudes, and subtracted these mean values from the original image to get a background-corrected image.Reistad et al. (2013Reistad et al. ( , 2014Reistad et al. ( , 2016) followed a similar approach, but binned the intensities in each pixel by to account for viewing geometry and used the binned median instead of mean.Li X et al. (2004) fitted the function to Polar UVI images and Laundal and Østgaard (2009); Laundal et al. (2010b) fitted a 2D polynomial to IMAGE and Polar FUV images while ignoring emissions from auroral latitudes.The background intensity has also been determined regionally by dividing images in circular sectors around the magnetic pole and then fitting a Gaussian/double Gaussian (intended to represent the aurora) plus a second order polynomial (non-aurora) to the latitudinal intensity profiles to find the boundaries of the oval (Carbary et al., 2003;Longden et al., 2010;Laundal et al., 2010a).
In this paper, we demonstrate a data driven approach to model non-auroral emissions in global FUV images using robust statistics while allowing temporal variations in the background model.The method can be applied to historical missions and is relevant for upcoming missions such as the Solar wind Magnetosphere Ionosphere Link Explorer (SMILE) mission (Raab et al., 2016).To demonstrate the model, we use data from the three FUV imagers on IMAGE.The data will be introduced in the next section.We describe the approach in Section 3 and discuss benefits and limitations of the proposed method in Section 4. Python code to use this method and the data set used in this publication are publicly available (Ohma et al., 2022a(Ohma et al., , 2022b)).

Data
We use global images obtained by the Far-Ultraviolet (FUV) cameras (Mende et al., 2000a) on board the Imager for Magnetopause-to-Aurora Global Exploration (IMAGE) mission (Burch, 2000), which were operational during 2000-2005.IMAGE had a spin period of about 123 s and auroral images was obtained once during each spin period when the detectors scanned Earth.Three cameras in the FUV range monitored the aurora: the Wideband Imaging Camera (WIC), which was sensitive to emissions in the Lyman-Birge-Hopfield (LBH) range (Mende et al., 2000b), and two spectographic imagers (Mende et al., 2000c), one sensitive to the Ohma A et al.: Background removal from global auroral images: Data driven dayglow modelling Doppler shifted 121.8 nm hydrogen line (hereafter SI12) and one sensitive to the 135.6 nm oxygen line (hereafter SI13).WIC had an integration time of 10 s, whereas SI12 and SI13 had an integration time of 5 s.To illustrate the method, we use data from one orbit, spanning from 07:24 to 16:38 on 28 August 2000.Figure 1 shows images obtained at 09:40:57 using the three cameras.
The raw images obtained by the cameras are corrected to account for temperature and voltage differences throughout the lifetime of IMAGE.A part of this processing is a flat-field correction, which takes into account sensitivity differences in the direction parallel to the spin axis of IMAGE (vertical direction in Figure 1).The images have highest sensitivity near the center and weaker sensitivity at the top and bottom.The values of the flat-field correction were found by examining the response of each camera to dayglow at middle latitudes when IMAGE was near perigee (Frey et al., 2003b).The intensity of all images are then scaled by to produce a corrected intensity .However, does not only consist of the real signal that the detector was exposed to, but also a constant noise level .The original scaling is therefore (1) where and are the row and column of the detector as oriented in Figure 1.
displays the WIC image in Figure 1a with the original flatfield correction applied.The colormap is highly saturated to highlight the background.There are clear intensity differences across the detector in the vertical direction, not related to aurora or dayglow.This difference is also seen in the upper right part of the image, which observes background space with no emissions.It is thus clear that the flat-field correction is not working perfectly.However, simply omitting the correction causes a significant increase in the spread in regions with a real signal (dayglow and aurora).The flat-field correction is thus necessary, but we believe that it should not be applied to the static background noise in WIC, only to the signal.To fix this issue, we assume that is constant across the image and take the median of all pixels with solar zenith angle larger that 100 (dark hemisphere) or pixels not pointing at the Earth to determine its value.We then remove the noise before we apply , such that (2)

C
We have chosen to add after the flat-field correction to keep the intensities similar to the original implementation, but this is not strictly necessary.The result is seen in Figure 2b, indicating a clear improvement of the intensity difference across the detector.This change is only applied to WIC, as we have not seen indications of

Background Removal
In this section, we describe a data driven model that solves for the non-auroral background counts in the time period spanned by the input FUV images.We first describe how we model the part of the dayglow aligned with the solar zenith angle while also accounting for viewing angle, which models most of the background signature.We then describe how we model large-scale residuals and typically asymmetric background.

B-spline Model
The modelling scheme described here is a development of the method proposed by Ohma et al. (2018).First, we use that the intensity of emissions in the FUV range approximately follows when seen from nadir, where is the solar zenith angle.Second, we take into account that the pixels in the detector generally observe the ionosphere at an angle.The dayglow emission lines considered in this study are all optically thin.The entire height segment where the emissions are produced thus contribute to the observed signal.When the emissions are observed at an angle, the effective height of this emission column .
Here, refers to the viewing angle of the observation.Based on these arguments, we choose to model as a function of .(aurora) and a monotonically increasing trend for (dayside).The dayglow intensity can be modelled as a spline function, which again can be expressed as a linear combination of Bsplines (c.f.de Boor, 2001) (3) where is the magnitude of the corresponding B-spline .As described below, the temporal dependence is modeled using a separate set of B-splines.We therefore refer to the B-splines in Equation ( 3) as spatial B-splines.Each of these spatial B-splines are non-zero at an interval along specified by the degree and knot locations used.The spatial B-splines used to model are displayed in Figure 3b, and are of order 3 with dense knot locations (vertical red lines) around zero to accommodate the fast transition of across the terminator.In addition, there are exterior knots at is seen at all time steps, but there are clear variations in the intensity.There are rapid jumps between subsequent images, in addition to a general reduction of the observed intensity at the dayside from the beginning to the end of the event.As there can be temporal variations in the dayglow intensity, it is beneficial to have a model that allows for slow variations in the modelled intensity.Ohma et al. (2018) took this into account by allowing the spatial B-spline coefficients to vary linearly in time.Here we instead use temporal B-splines, which allows for a much more flexible model.To do this, we rewrite Equation (3) as where the coefficients are the magnitude of the B-splines and .If temporal knots are located only at the first and last time step of an interval, a temporal order of zero yields a constant dayglow model.An order of one yields a linear time variation, an order of two a quadratic time variation and so on.Knots can also be placed within the modelled time interval, allowing even finer temporal fitting.We note that the model will only be smooth in time for a temporal order of two or greater if there are knots within the considered time interval, otherwise the model will be discontinuous (order zero) or have infinite gradients (order one) at the knots.In this study, we have used temporal Bsplines of order two with an approximate 140-min knot separation (evenly spaced knots between the first and last time step in the event) as displayed in Figure 4b.The locations of the interior knots are indicated by the vertical red lines.

a nl
The goal is to obtain the model coefficients that best describe the background at any given time.We write the forward problem as where is a column vector with the observed intensities and is a column vector containing the model coefficients .The matrix describes the linear relationship between and given by Equation ( 4), commonly referred to as the design matrix.We use a combination of iterative reweighting and zeroth-order Tikhonov regularization to reduce the influence of auroral emissions and avoid overfitting.We thus model as the leastsquares solution of The regularization parameter determines the trade-off between minimizing data misfit and the model norm, and is the identity matrix.The diagonal matrix represents the weight of each observation and is decomposed into .The diagonal matrix contains weights related to data coverage to reduce spatial bias, defined as the inverse of the number of measurements in equal bins along at each time step.The diagonal matrix contains weights that are updated iteratively to reduce the influ-k ence of outliers such as aurora.We use Tukey weights (Beaton and Tukey, 1974), where the weight of each measurement is defined as Here, is the misfit given by , is the spread of the observations and is a scaling factor we have set to 5. It is common to use a single value for , often the root−mean−square error (RMSE).However, the noise in our model is clearly heteroskedastic along .To accommodate this, we first calculate the RMSE in bins; a single bin for and 15 bins for .We use all the images in the event and assume that the noise is constant with respect to time.To find a continuous noise model, we assume that total variance is proportional to , such that where and represent the relative and absolute variance of the detector.The two coefficients and are found as the leastsquares solution when putting the binned RMSE on the left side of this equation.We then get a smooth estimate of the variance by inserting the two parameters into Equation ( 8), which is subsequently used in Equation ( 7).Tukey weights set outliers to zero.We believe that this is more appropriate than weighting schemes that only reduce the importance of outliers such a the commonly applied Huber weights (Huber, 1964), as the main outliers in the background model are not observational extremes but rather observations of a different distribution (aurora) than the one we are trying to model (dayglow).The model is updated iteratively until the relative change of the 2-norm of between two iterations is less that .The regularization parameter is found using standard Lcurve analysis (Hansen, 1992).For the time interval considered in this study, reasonable values for are 0.01, 0.1 and 0.1 for WIC, SI12 and SI13 images, respectively.When the final is determined, we get the modelled dayglow intensity using Equation (4).
The performance of the B-spline (BS) model is displayed in Figures 3c, 3d, 4c, 4d and 5. Figure 3c shows the final BS model in orange together with the observed intensities.The color of each point represents its weight after the final iteration.Here we see that the auroral pixels have been down-weighted and that the noise acceptance is greater at the dayside compared to the nightside.Figure 3d displays on a logarithmic scale for the initial (blue) and final (orange) iteration.Here we see the effect of the iterative scheme; the background is initially over-estimated near due to aurora, but this effect is reduced as the auroral observations are down-weighted.In Figure 4c we see the temporal evolution of the background model, which clearly captures the gradual change seen in Figure 4a. Figure 4d displays how the final model coefficients (cyan) to (pink) vary in time.This is thus the temporal evolution of the magnitude of the spatial B-splines in Figure 3b.

Residual Background Model
As evident from Figure 5, systematic residual background can be present in the B-spline corrected images.This can be related to real differences in the non-aurora emissions, due to for instance dawn-dusk asymmetries in the composition of the upper atmosphere following active periods.Instrumental effects, stray sunlight entering the detector, errors in the viewing angle correction and/or errors in the geolocation of the images can also contribute to systematic residual background.
We model this residual background intensity using a spherical harmonic (SH) model Here, and are the order and degree of the SH, respectively.The is the longitude relative to the geographic longitude of the θ g subsolar point, whereas is the geographic colatitude.The coefficients are the amplitudes of the surface waves, where the B-spline allows for temporal variations.We have used the same degree and knot locations as the BS model (Figure 4b).The goal is to identify the model coefficients that best describe the residual background intensity via the forward problem: To make the noise approximately white, we divide each row of both and by the corresponding obtained in the B-spline model.To avoid fitting the aurora, we only use low-order terms ( ).Since we only model one hemisphere we do not need terms that describe interhemispheric asymmetries, and therefore we only consider terms where is even.As with the B-spline model, the least-squares solution is used to find .Again, the diagonal matrix represents the weight of each observation.The data coverage part is The SH model applied to the images obtained at 09:40:57 UT on 28 August 2000 is displayed in Figure 6, in the same format as Figure 5. Especially in the WIC image, it is clear that the SH model captures the systematic dawn-dusk differences present in the Bspline corrected image.The SH corrected images thus have a more even background compared to the B-spline corrected images with no SH correction applied.

Discussion and Conclusions
In the foregoing sections we have described a data driven method to remove non-auroral emissions from global FUV images.The method is implemented in python, and is publicly available (Ohma et al., 2022a).The method has several advantages: It combines many images in two inversions while still allowing for (slow) temporal variations, which ensures robust statistics due to the abundance of data points used when constructing the models.This means that the model is also reliable when only a fraction of the auroral zone is observed, since the conjunction of many images provides a global view.This is seen in the first hour of Figure 4, where the modelled intensity matches the mean observed intensities despite the small region covered.Furthermore, the SH model captures potential dawn-dusk asymmetries in the background emissions, which could be critical during active periods.The method is highly flexible, and it is straightforward to adjust the location of knots and the order of both the spatial and temporal B-splines and to adjust the order and degree of the SH model.The final model is smooth everywhere (including across the terminator) and the iterative de-selection of outliers such as aurora means that we do not have to presume the location of the

1/cosα d
Since we use B-splines to model the main dayglow signal, we do not presuppose the location of the terminator nor the functional form of the dayglow relative to the solar zenith angle.We do, however, assume that the correction correctly accounts for viewing geometry.Since this term goes to infinity at 90°, only the part of images below some upper limit should be used (e.g. between 70-80 degrees).The geolocation is also very uncertain for large viewing angles.For imagers that observe optically thick emission lines, like the Visible Imaging System Earth camera on Polar, no viewing angle correction should be applied.By correcting for viewing angle, we map two dimensional images to a single dimension, which we then model based on B-splines.The presence of large scale residual features in the BS corrected images indicate that this mapping is not perfect.We have therefore introduced a second inversion based on spherical harmonics where we consider the full two dimensional nature of the images, which reduces such large-scale residuals.In the presented event, the SH model is up to 10% of the BS model in the sunlit hemisphere, depending on camera and location.The SH model intensities are weak in SI12 and SI13, but comparable to the intensity of the dayside aurora in WIC.This should not be regarded as a general result, since the magnitude of the SH model varies and the dayside aurora can be weak or even absent in other events.In a future project, we will consider the full two dimensional problem (three dimensional when considering the temporal dimension) by using B-splines defined on a sphere (Schumaker and Traas, 1991).If successful, the full dayglow can be modelled in a single inversion.
As evident from Figure 4, there are rapid fluctuations in the observed intensities with time.By design, the background model does not capture such rapid changes.It is possible to pick up these fluctuations by applying the described method on individual images, similar to other dayglow fitting schemes used before (e.g.Lummerzheim et al., 1997;Li X et al., 2004;Laundal and Østgaard, 2009;Reistad et al., 2014).However, unless the fit is consistent for every image and the fluctuations are purely caused by differences in the baseline of the images, it is not possible to separate whether differences between two images are a result of differences in the signal or difference in the background detection.By enforcing only slow variations, we ensure that any rapid change is due to changes in the signal alone.This will occasionally lead to more residual background in the corrected images, but also a more robust model and easier interpretation when considering temporal changes.In the BS model, we model as a function of to account for heteroskedasticity.We also divide by this quantity in the SH model to normalize the noise.An alternative way to account for heteroskedasticity is to perform a variance-stabilizing transformation.A suitable transformation can be found empirically by e.g.finding the best power transformation using the Box−Cox test (Box and Cox, 1964).If the functional form between the variance and the mean is known or can be reasonably assumed, the transformation providing homoskedasticity is proportional to , where is the function relating the mean to the variance (Bartlett, 1947).If the spread is simply proportional to the signal (relative error), this yields a logarithmic transformation.This transformation actually performs well when applied to WIC, but not when applied to SI12 and SI13.However, if there is both an absolute and a relative spread, like we assumed in Equation ( 8), the appropriate transformation becomes proportional to , where is the observed intensity.This means that both and must be known a-priori.
For SI12 and SI13, there is an additional instrumental effect to consider; the two detectors have a low dynamical range which means that the observed intensities are quasi-discrete.The separation between measured intensity levels are equal in linear space, but any transformation will break this.For instance, a logarithmic transformation will lead to a large separation between the discrete levels for low intensities and a small separation between the discrete levels for high intensities, making it hard for the model to converge.
The background-corrected images contain many pixels with negative intensity (Figures 5 and 6) due to spread in the original signal.Since negative auroral emissions have no physical meaning, they should be interpreted as regions without such emissions (zero intensity).However, if images are binned or fitted as part of a quantitative analysis, negative intensities should be included to avoid overestimation of the binned or fitted intensities.

2
The method outlined in this text can be applied to both previous and future missions with little or no adjustments to the algorithm.It can be used to remove background emissions from the auroral emissions in order to more easily examine the auroral topology, intensity and dynamic evolution.The combined BS and SH models also provide a smooth representation of the dayglow emissions, which can be used to infer the column density of O/N when applied to appropriate FUV emission lines (e.g.Meier, 1991Meier, , 2021)).We hope the presented method can be a useful tool when analysing large-scale FUV images, for example for the upcoming SMILE mission which also will have a camera in the LBH range.

I
Figure 3. (a) Observed WIC intensities at 09:38:55, 09:40:57 and 09:43:00 on 28 August 2000 sorted by .(b) Spatial B-spline functions used to model the background.The vertical red lines indicate the locations of interior knot points.(c) Background model (orange) and the weight of the WIC observations in Figure 3a after the final iteration.(d) Modelled background intensity (solid) for the first (blue) and final (orange) iteration on a logarithmic scale.The dashed line shows .

Finally,
Figure 5 shows the BS model applied to the images obtained at 09:40:57 on 28 August 2000.From the top, the rows display WIC, SI12 and SI13.From the left, the columns display the Ohma A et al.: Background removal from global auroral images: Data driven dayglow modelling original image projected on Earth assuming a height of 130 km, the model, the corrected image and the weights after the final iteration.It is clear that subtracting the model from the observed intensities successfully removes most of the background and that auroral pixels are successfully deselected as outliers.

Figure 4 .
Figure 4. (a) Temporal evolution of binned mean intensities observed by WIC.The time is relative to the first image of the event (07:24:12 on 28 August 2000).(b) Temporal B-spline functions used to model the background.The vertical red lines indicate the locations of interior knot points.(c) Temporal evolution of the BS model after the final iteration.(d) Temporal evolution of the model coefficients, where the colors of each line corresponds to the spatial B-splines in Figure 3.

Figure 5 .
Figure 5. BS model applied to WIC (top), SI12 (middle) and SI13 (bottom) images obtained at 09:40:57 on 28 August 2000.From left to right, the columns display the original images projected on Earth, the BS models, the BS corrected images and the weights after the final iteration.

Figure 6 .
Figure 6.SH model applied to WIC (top), SI12 (middle) and SI13 (bottom) images obtained at 09:40:57 on 28 August 2000.From left to right, the columns display the BS corrected images, the SH models, the SH corrected images and the weights after the final iteration.