
Astronomical Observatory of the Jagiellonian University


Astronomy Object of the Month: March 2024


Classification of galactic mergers with convolutional neural networks

Lambda-CDM cosmology assumes that galaxies form by hierarchical mergers of smaller structures, so mergers of galaxies provide astronomers with crucial information about the evolution of galaxies over time.

Illustration: Arp 87 system, or NGC 3808A and NGC 3808B: colliding galaxies photographed with the Hubble telescope. Source: APOD / NASA, ESA, Harshwardhan Pathak.

The merging process unfolds differently depending on the properties of the colliding galaxies. In major mergers, whose components have a mass ratio of no more than about 1:4, a collision at a suitable speed and angle merges the two structures and modifies their morphology. Features characteristic of galaxy collisions can then become apparent: bridges, double nuclei and other tidal features, including tails and plumes of various shapes and sizes. Major mergers can also trigger episodes of rapid star formation and the activation of galactic nuclei (AGN). However, the influence of mergers on these phenomena is still debated, as it is not clear exactly what role they play compared to other physical processes, such as smooth gas accretion.

When the mass of one of the galaxies involved in a collision is significantly greater than that of the other, the merger proceeds more smoothly: the smaller galaxy is usually absorbed by the larger one, leaving the latter almost untouched. Galaxy mergers are a significant factor in the study of the evolution of the Universe. It is estimated that mergers account for less than 10% of galaxies at low redshift, with the fraction rising to about 20% at redshifts between 2 and 3.

A major challenge in the study of galaxy collisions is detecting them with sufficient efficiency and completeness. Because merging galaxies show such a wide range of morphological features, visual classification is difficult to apply consistently and reproducibly. Morphological parameters, which describe the shapes and the concentration of light in the images, offer a more reliable and reproducible way to quantify morphology. However, measuring these parameters requires images of sufficiently high resolution. Another method, so-called close pairs, is more direct: it involves finding pairs of galaxies that lie close together on the sky and at the same redshift. However, it requires expensive, long-term spectroscopic observations.
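One widely used morphological parameter is the rotational asymmetry index: merging galaxies tend to be more asymmetric than undisturbed ones. The NumPy sketch below illustrates the idea on synthetic toy images (not data from the study); the formula shown is the basic form of the asymmetry statistic, without the background correction and centre optimization used in practice.

```python
import numpy as np

def asymmetry(image: np.ndarray) -> float:
    """Rotational asymmetry A = sum|I - I_180| / sum|I|.

    I_180 is the image rotated by 180 degrees about its centre;
    disturbed, merging galaxies tend to have higher A than
    smooth, symmetric ones.
    """
    rotated = np.rot90(image, 2)  # 180-degree rotation
    return np.abs(image - rotated).sum() / np.abs(image).sum()

# A symmetric toy "galaxy": a circular Gaussian blob on an odd-sized
# grid, so the centre pixel is the rotation centre.
y, x = np.mgrid[-32:33, -32:33]
symmetric = np.exp(-(x**2 + y**2) / (2 * 8.0**2))

# A disturbed one: the same blob plus an off-centre companion.
disturbed = symmetric + 0.5 * np.exp(-((x - 15)**2 + (y - 10)**2) / (2 * 4.0**2))

print(asymmetry(symmetric))   # ~0 for a perfectly symmetric image
print(asymmetry(disturbed))   # noticeably larger
```

The off-centre companion breaks the 180-degree symmetry, so the second value is clearly larger, which is exactly the signal such parameters exploit.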


With upcoming sky surveys such as Euclid and LSST set to provide astronomers with large amounts of data, it will be crucial to develop algorithms that automate time-consuming and repetitive tasks such as identifying galaxy mergers. Recent studies have shown that convolutional neural networks (CNNs) can be used to solve the problem of visual classification of galaxy mergers.

A recent comparative study by an international team of astronomers aims to understand the relative performance of different machine learning methods for galaxy collision detection within a single framework (Margalef-Bentabol et al. 2024). Six machine learning methods were tested, all based on the same cosmological magnetohydrodynamic IllustrisTNG simulations, which were used to generate images mimicking real data. All of the networks that had not been pre-trained on galaxy images yielded similar results, despite being built on different architectures. This may indicate that preprocessing the training data is a more important factor than the choice of network parameters.

This aspect is now also being studied by astronomers from the Jagiellonian University Astronomical Observatory, who are testing and comparing the performance of convolutional networks trained on original and processed data. They fit Sérsic profiles, which describe how light intensity falls off with distance from the galactic centre, to synthetic images of galaxies, and then subtract them from the originals. This process creates so-called residual images, in which anything that does not fit the fitted profile remains. Subtle signatures of galactic mergers, such as diffuse structures or tidal features, are thereby highlighted. As a result, three networks with the same architecture can be trained on three different data sets: the original images, the fitted Sérsic models and the residual images.
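The Sérsic law itself is simple enough to sketch directly. The toy example below (pure NumPy, with made-up parameters; a real analysis would fit the profile to the data, for instance with astropy's `Sersic2D` model) builds a synthetic galaxy from a smooth Sérsic component plus a faint off-centre feature, and shows how subtracting the smooth model leaves that feature in the residual image:

```python
import numpy as np

def sersic(r, I_e, r_e, n):
    """Sérsic law: I(r) = I_e * exp(-b_n * ((r/r_e)**(1/n) - 1)).

    b_n is chosen so that r_e encloses half the total light;
    b_n ~ 2n - 1/3 is a standard approximation for n >~ 0.5.
    """
    b_n = 2.0 * n - 1.0 / 3.0
    return I_e * np.exp(-b_n * ((r / r_e) ** (1.0 / n) - 1.0))

# Synthetic "galaxy": a smooth Sérsic component plus a faint tidal feature.
y, x = np.mgrid[-64:65, -64:65]
r = np.hypot(x, y)
smooth = sersic(r, I_e=1.0, r_e=10.0, n=2.0)
tail = 0.1 * np.exp(-((x - 25) ** 2 + (y - 25) ** 2) / (2 * 10.0 ** 2))
image = smooth + tail

# "Fitted" model: here we simply reuse the known smooth component;
# in practice the profile parameters come from a 2-D fit to the image.
residual = image - smooth

# The residual isolates the faint feature hidden under the bright profile.
print(residual.max() / image.max())
```

The point of the exercise is visible in the last line: the tidal feature is orders of magnitude fainter than the galaxy's centre in the original image, but dominates the residual image once the smooth profile is removed.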

The results show that the network trained on the original data performs best, with an overall accuracy (the fraction of correctly classified images) of 74%. It performs better at identifying non-colliding galaxies: 82% of them are identified correctly, while for colliding galaxies this figure drops to 64%. The network trained on the Sérsic models recognizes non-mergers with similar performance, classifying 80% of them correctly, but does much worse at identifying mergers, correctly predicting only 56% of them. The network trained on residual images behaves differently: it correctly classifies 67% of images of undisturbed galaxies and 66% of images of colliding galaxies, making it the most effective of the three at correctly classifying mergers. The latter two networks have similar overall accuracy, classifying about 69% of all images correctly.
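Per-class figures like these are simply the diagonal of a confusion matrix divided by the true class counts, while the overall accuracy is its trace divided by the total. A short illustration with made-up counts (chosen only to echo, not reproduce, the percentages above):

```python
import numpy as np

# Illustrative confusion matrix (rows: true class, columns: predicted),
# with made-up counts -- not the study's actual data.
# Classes: 0 = non-merger, 1 = merger.
cm = np.array([[820, 180],   # 1000 true non-mergers, 820 classified correctly
               [360, 640]])  # 1000 true mergers, 640 classified correctly

overall_accuracy = np.trace(cm) / cm.sum()        # correct / total
per_class_recall = np.diag(cm) / cm.sum(axis=1)   # correct per true class

print(overall_accuracy)   # 0.73 for these counts
print(per_class_recall)   # [0.82, 0.64]
```

Note how a network can look reasonable on overall accuracy while missing a large fraction of one class, which is why the study reports merger and non-merger rates separately.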

These machine learning experiments show that galaxies can be classified into mergers and non-mergers using both the faint diffuse structures present in the residual images and the spatial information contained in the Sérsic profiles. The next step will be to adapt networks trained on synthetic images to real astronomical data, since networks trained on simulation inputs perform much worse when evaluating real images. Techniques that focus, among other things, on finding features common to the domains of the training and evaluation images (domain adaptation) appear to be key in this respect.

Original publication: in preparation.

The findings described are part of a study conducted in the Department of Stellar and Extragalactic Astronomy of the Jagiellonian University Astronomical Observatory in Kraków.


Dawid Chudy
Astronomical Observatory
Jagiellonian University
Dawid.Chudy [@] doctoral.uj.edu.pl