Toward approaches to big data analysis for terroristic behavior identification: child soldiers in illegal armed groups during the conflict in Donbas region (East Ukraine)
The paper considers an approach to using big data (social network content) to understand social behavior in conflict zones and to analyze group dynamics in illegal armed groups. The analysis is aimed at identifying underage members of armed formations. Probabilistic and stochastic methods are proposed for the analysis and classification of the size, structure, and dynamics of illegal armed groups. Data relating to the anti-terrorist operation in Donbas (Eastern Ukraine, 2014-2015) are used. The quantitative distribution of members of illegal armed formations by age, gender, origin, and social status is calculated. Conclusions are offered on the suitability of the described method for criminological practice and on the possibilities for interpreting the results in the context of terrorism studies.
In criminological practice and terrorism research, many cases require sophisticated scientific instruments, not only at the stage of evidence analysis but also at the preliminary stages, in particular at the stage of crime identification.
Describing criminal activity and identifying a crime is a challenge in some cases, for example in areas of crisis, conflict, and fighting, because the available information and data are severely limited.
In such cases it is necessary to use many different sources of information, including social networks, together with statistical approaches adapted to the assessment of these data. Correct statistical methods of data collection, analysis, filtering, and regularization are critical here.
Robust spatial-temporal distributions of data obtained with these mathematical methods could be used to define the event and to identify a crime.
Social networks reflect the motivations behind the behavior of different groups in society and of varied social environments (Wasserman, Galaskiewicz, 1994). Because of their large scope, they are a good basis for sociometrics and behavior analysis (Krause, Croft, James, 2007). However, formal numerical methods of social network data analysis are still not sufficiently developed for a wide range of important cases, in particular for crisis management (Lerbinger, 2012) and conflict analysis (Scott, 2012).
This paper addresses the task of extracting structured distributions of data, regularized by determining parameters, from large sets of unstructured data. The algorithm is based on the specifics of big data distribution and on the characteristics of the data source (e.g. group behavior). The approach aims to detect stable indicators of criminal and/or terrorist activity. At the same time, the analyzed data and the cyber activity that produces them may be legal in most cases (propaganda is beyond our scope).
The proposed algorithm makes it possible to analyze the content of social networks on the basis of a set of selected indicators. These indicators make it possible to monitor the social dynamics of different social groups represented in social networks and to analyze their behavior, including the identification of evidence of terrorism.
Admittedly, social media activity is not in itself terrorism in the strict sense, since it is not a method of achieving political goals through direct violence (physical or psychological). However, an important feature of terrorism is its demonstrative character. An attack requires a nationwide, or ideally a global, audience. Therefore, the information component of terrorist activity is extremely important.
In particular, the information campaign of terrorist activity has several main objectives. First, it propagates the idea that the central power is impotent and calls for the creation of alternative authorities. Second, it creates precedents of active disobedience and military confrontation with the central power. Third, it disseminates appeals to the people to join the active opposition to the authorities, glorifies terrorists, and promotes their ideas and lifestyle. In addition, positive information about terrorism activates local forces and social moods opposed to the government, including those that previously distanced themselves from terrorist tactics. Furthermore, an attack is treated as an indisputable sign of an acute crisis in society. All this pushes society and the authorities to make concessions to political forces that use terrorist tactics. Terrorism and its propaganda strike at the economy, reduce the investment attractiveness of the country, degrade its image, and push it toward radicalization of the political course and authoritarian forms of government (often this evolution is the terrorists' purpose).
Terrorism is the most dangerous and the most effective (by the ratio of invested resources to result) way to destabilize a society politically. It is also one of the most effective destabilizing tools available to an adversary in modern hybrid conflicts.
At the same time, the use of terrorist tactics presupposes a set of specific socio-cultural and political characteristics of the target society. Terrorism is a phenomenon inherent in the crisis stage of the modernization of society. Commonly, the completion of modernization removes the grounds for terrorism. Terrorism occurs at the boundaries of cultures and periods of history.
Terrorism is an indicator of crisis processes. It is the "emergency channel" of feedback between society and government, and between a separate part of the community and society as a whole. It points to acute trouble in a certain area of social space. In this respect, terrorism has no purely forceful, police-based solution. Localizing and suppressing terrorists is only part of the job. The other part involves political, social, and cultural changes, which should remove the reasons for the radicalization of society and the base of terrorism.
In this context, the analysis of social dynamics and of the behavior of social groups is becoming an important factor in the control of terrorist activities. The task addressed by the approaches proposed in this study is not so much the identification of the criminals as the identification of the crime and of its social base, and the determination of the causes and driving forces of terrorism.
In this paper, using data on the conflict in parts of the Lugansk and Donetsk regions (Eastern Ukraine), we present an algorithm and method of data collection, filtering, and regularization for the task of assessing child involvement in illegal military groups.
In other words, we show how to use statistical algorithms in criminology and terrorism studies for decision-making under deep uncertainty.
The antiterrorist operation, that is, the armed conflict in the Donbas (Eastern Ukraine, see Figure 1), extends over spatially distributed urban agglomerations with widely available and accessible information infrastructure and communications. The conflict zone is an area of about 16,800 square kilometers with a varying population: from 2.1 to 2.6 million, up to a maximum of 3.1 million, depending on the intensity of fighting, the economic situation, and the season. The area has 990,620 telephone lines, 1,786,300 Internet lines (including optical fiber), 1,196,820 Internet users, and 3,705,750 mobile phones serviced by 5 providers.
While traditional media function under external pressure and censorship and are often used as an "information weapon", social media, above all social networks, have become an important source of information about the development of the conflict. The question of correct methods for collecting, processing, and analyzing data from social networks is therefore important for the analysis of the conflict.
In conflicts and crises, acquiring accurate, timely, and adequate information is a daunting challenge. Usually the parties to the conflict use different tools of censorship, public information is distorted, and observers' access to the crisis area is limited. In such a situation, social networks become a good source of operational information.
The psychological features and the characteristics of group behavior of social network users determine the distribution of data and information flows (Lerman, 2013).
Usually, in conflicts of values, and also in many racial, religious, and ethnic conflicts, there are stable sets of motivations of group behavior, presented in the form of symbols. These symbols serve as a justification for violence and are widely used in propaganda. They are the value markers of certain social groups. In social networks, these symbols are distributed in the form of hashtags and are convenient search criteria for gathering information (Sultana, Paul, and Gavrilova, 2015).
In each case, the information contained in social networks is incomplete, noisy, and distributed across interconnected clusters. We should therefore apply robust statistical algorithms for the collection, filtering, and classification of information using a set of interrelated criteria. Furthermore, the classified information should be regularized to obtain spatial-temporal distributions with controlled reliability. As a result we obtain a coherent description of the situation in which all data are statistically related to each other and mutually verified. Only in this case are the data meaningful as evidence, rather than fragmentary information from unknown and unconfirmed sources (Caverlee, Cheng, Sui, et al., 2013).
The first stage is the search for and collection of data by identifying hashtags relating to the area of conflict, location, group membership, and age or date of birth.
Using the requested hashtags, over 21,500 profiles in social networks (VK, Facebook, Instagram, Twitter) in five languages (Russian, Ukrainian, English, Serbian, and German) were analyzed. The following hashtags were used: #донбасс (Donbas), #новороссия (NovoRossia), #юныегероиновороссии (NovoRossiaYoungHeroes), #героиновороссии (NovorossiaHeroes), #память (Memory). More than 480,000 entries satisfied the defined criteria. After filtering by age (12-18 years) / date of birth (after 1998) and by belonging to an illegal military group, about 5,400 entries were collected.
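The filtering step described above can be illustrated with a minimal sketch. It assumes that each entry has already been reduced to an anonymized record; the `Entry` fields, the `group_flag` label, and the sample values are hypothetical and are not taken from the study's data. The sketch only shows the age-window and group-membership conditions, not the collection of entries.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Entry:
    """A synthetic, anonymized record; all fields are hypothetical."""
    entry_id: int
    birth_date: date
    observed_on: date
    group_flag: bool  # True if the entry was tagged as linked to an armed group


def age_on(birth_date, day):
    """Age in completed years on a given day."""
    years = day.year - birth_date.year
    if (day.month, day.day) < (birth_date.month, birth_date.day):
        years -= 1  # birthday not yet reached in that year
    return years


def filter_entries(entries, min_age=12, max_age=18):
    """Keep flagged entries whose age at observation lies in [min_age, max_age]."""
    return [
        e for e in entries
        if e.group_flag and min_age <= age_on(e.birth_date, e.observed_on) <= max_age
    ]


# Synthetic sample, for illustration only.
sample = [
    Entry(1, date(1999, 5, 1), date(2014, 6, 15), True),    # age 15, flagged
    Entry(2, date(1990, 1, 1), date(2014, 6, 15), True),    # age 24, outside window
    Entry(3, date(2000, 7, 1), date(2014, 6, 15), False),   # age 13, not flagged
    Entry(4, date(2002, 6, 16), date(2014, 6, 15), True),   # age 11, one day short of 12
]
kept_ids = [e.entry_id for e in filter_entries(sample)]
print(kept_ids)
```

Computing the age at the observation date, rather than comparing birth years alone, avoids misclassifying records near the boundaries of the age window.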
A classification procedure (1)-(3) was applied to these entries. The data were then regularized by a two-stage procedure (4)-(9).
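Since procedures (1)-(3) are not reproduced in this section, the following is only a generic sketch of a maximum a posteriori (MAP) decision, which minimizes the classification error under a zero-one loss. The class names, priors, and likelihood values are hypothetical, and the spatial and temporal context that a Markov random field model would add is omitted.

```python
import math


def map_classify(log_priors, log_likelihoods):
    """Return the class with the largest log posterior (up to a constant)."""
    scores = {c: log_priors[c] + log_likelihoods[c] for c in log_priors}
    return max(scores, key=scores.get), scores


def posterior(scores):
    """Normalize unnormalized log posteriors into probabilities."""
    top = max(scores.values())
    weights = {c: math.exp(s - top) for c, s in scores.items()}  # stable exponent
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


# Hypothetical numbers for one entry: P(class) and P(features | class).
priors = {"relevant": 0.1, "irrelevant": 0.9}
likelihoods = {"relevant": 0.6, "irrelevant": 0.05}

label, scores = map_classify(
    {c: math.log(p) for c, p in priors.items()},
    {c: math.log(p) for c, p in likelihoods.items()},
)
post = posterior(scores)
print(label, round(post[label], 3))
```

With these illustrative numbers the unnormalized posteriors are 0.06 and 0.045, so the entry is assigned to the "relevant" class even though its prior probability is low.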
The method used provided an overview of the distribution of key characteristics of illegal armed groups in the conflict zone. Between 40 and 60 separate armed groups were identified during the observation period. They varied significantly in weapons, equipment, and ideology (Soviet patriots, Russian imperialists, neo-Stalinists, anarchists, neo-Nazis, Christian Orthodox extremists) and were united only by anti-Western ideas and pro-Russian rhetoric.
They acted quite independently. The opinion of the nominal leaders of the region determined their actions only to the extent that the groups depended on the supply of arms, ammunition, fuel, and food from Russia via the central leadership. However, toward the end of the observation period, the number of autonomous groups decreased and their centralization increased.
At the beginning of 2016 the number of armed groups was about 30. During the active phase of the conflict, April 2014 – December 2015, the total number of illegal armed group members ranged from 19.5 to 52 thousand, and by the end of 2015 it amounted to 35.7 thousand. The heterogeneity and uncertainty of this population are noticeable: more than 60% of the militants are citizens of Russia and other countries, and about 15% are Russian military personnel.
For a detailed study of the general population, it is necessary to use additional tools based on stochastic methods of assessment (Ermoliev, Makowski, Marti, 2012). This makes it possible to analyze the distributions of socioeconomic parameters of members of illegal armed groups.
In this study, we limit ourselves to the narrow, concrete case of child soldiers. The calculated distributions of the studied parameters show the following results.
The data were obtained from the information resources of local communities, their supporter groups, the social network homepages of militants and their families, and media publications. This allows us to estimate the number of children, their distribution and dynamics, and their age and gender characteristics, and in most cases to identify their origin and identity, including their residency and schools, and to reconstruct the history of their participation in the armed conflict.
The first case of involvement of children in an illegal armed group was detected in the town of Slovyansk at the end of April 2014. Underage personnel were recruited into the pro-Russian unit headed by Igor Stryelkov (a.k.a. Girkin), who captured the town. Then, during the fighting in Slovyansk on June 10-30, 2014, the first 2 child soldiers were killed and several children were injured. The estimated total number of people under the age of 18 who were involved in the conflict as military personnel on the side of illegal groups during the events in Slovyansk (calculated using procedures (1)-(3), (6)) is about 35-40.
Later, during 2014, the number of children in the illegal military groups gradually increased. Almost every group had underage participants, who in many cases were involved in the fighting.
The total number of underage soldiers in terrorist groups during the conflict in Donbas in the period April 2014 - January 2015 could be estimated at 210-250 persons. Their ages range from 13 to 18, mostly 16-17 years, although cases of 12-year-old children were also registered (Fig. 2).
Most (90%) underage members of illegal armed groups in 2014 and 2015 were boys, with an insignificant upward trend in their share (Fig. 3).
In 2015 the number of child soldiers began to decline. The likely reasons are the organization of the previously chaotic terrorist groups and a reduction of their support from the Russian Federation. However, local children continue to cooperate with militants, treating this as a source of income amid the economic degradation of the territory. During 2014-2015 direct subsidies from Russia accounted for about 88% of the total budget of the conflict region, while local economic activity (including illegal activity) accounted for only 12%. At the same time, losses among underage soldiers increased in 2015.
The question of how children become involved in the illegal armed groups is also important. In 2014, especially in the period April-October, the majority (80%) of underage soldiers were children of members of these groups (Fig. 4). Their parents (mainly fathers) were also detected among the members of armed groups. About 10% were children without parental care (registered by the regional social service or with permanent residency in other localities), and 10% were children from local families who joined armed groups by chance (children with local addresses and registered in local schools). Children were usually recruited through military-patriotic and sports clubs, where they were indoctrinated with specific pro-Russian and pro-Soviet ideas. Soviet symbols of WWII were widely used in propaganda aimed at children.
In 2015 the situation changed: the proportion of children of members of armed groups fell to 25%, and the proportion of children without parental care rose to 50%. The share of children from ordinary families also increased, to 25%. This is due to changes both in the structure of local communities and in the economy of the territory.
In 2015, against the background of the observed influx of mercenaries from Russia, the number of local residents in illegal armed groups, as well as of local child soldiers, decreased (Fig. 5).
With the significant reduction in the intensity of fighting during 2015, children were mainly used by militants for intelligence activity, housekeeping, and security functions.
The total number of underage soldiers in the Russian-supported illegal armed groups at the end of 2015 could be estimated at 70-90 persons. This number is not constant: children who are local residents are not permanently part of a group. The most stable group of children, up to 50% of them, are Russian citizens.
Thus, as the data demonstrate, about 300 underage soldiers (under 18 years) were involved on the side of the Russian-supported illegal armed groups during the armed conflict in the Donbas (Eastern Ukraine) in 2014-2015. About 150-180 child soldiers participated directly in combat. Up to 50 minors were killed during the fighting.
The estimates based on calculation procedures (1)-(3) and (4)-(9) correspond to the assessments of other sources, such as The 2016 Trafficking in Persons Report.
Social networks are an important and valuable source of information in conflict situations. The social network data analysis could be an important instrument in criminology, conflict research and terrorism study.
However, the specifics of big data require the application of correct techniques of data collection, processing, and analysis. In this context, it is important not only to collect and filter the data correctly, but also to apply a regularization procedure, which provides regular spatially-temporally distributed data that are mutually verified and have controlled reliability.
For data filtering and classification, an approach has been proposed that is based on the Bayes rule for minimum classification error, formulated as a maximum a posteriori (MAP) decision task in a Markov random field model representation of multi-temporal, multi-source data. As a result of classification we obtain a dataset with all records that meet the specified condition. For regularization of the collected, filtered, and classified data, a method based on the non-linear kernel principal component analysis (KPCA) algorithm, modified according to the specifics of the data, has been proposed. Using this algorithm, regularized spatial-temporal distributions of the investigated parameters were obtained over the whole observation period, with rectified reliability and controlled uncertainty, prepared for interpretation.
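To illustrate the kernel-based step, the following is a minimal sketch of standard kernel PCA with a Gaussian (RBF) kernel; it is not the modified algorithm used in the study. The kernel choice, the `gamma` value, the power-iteration solver, and the synthetic two-cluster points are all assumptions made for illustration.

```python
import math


def rbf_kernel_matrix(points, gamma):
    """Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2) for all pairs."""
    n = len(points)
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(points[i], points[j])))
         for j in range(n)]
        for i in range(n)
    ]


def center_kernel(K):
    """Center the kernel matrix in feature space (K is symmetric)."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    total_mean = sum(row_mean) / n
    return [
        [K[i][j] - row_mean[i] - row_mean[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def top_eigenpair(M, iters=200):
    """Leading eigenvalue and unit eigenvector of a symmetric PSD matrix
    by power iteration, started from a fixed vector for reproducibility."""
    n = len(M)
    v = [float(i + 1) for i in range(n)]
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(M[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v


def first_component_scores(points, gamma=0.5):
    """Projection of each training point onto the first kernel principal component."""
    Kc = center_kernel(rbf_kernel_matrix(points, gamma))
    lam, v = top_eigenpair(Kc)
    return lam, [math.sqrt(lam) * x for x in v]


# Synthetic data: two well-separated clusters in the plane.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (3.0, 3.0), (3.1, 2.9), (2.9, 3.2)]
lam, scores = first_component_scores(points)
print([round(s, 2) for s in scores])
```

On this synthetic sample the first kernel principal component takes one sign for the first cluster and the opposite sign for the second, which is the sense in which the projection exposes the dominant structure of noisy data.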
The methodological framework proposed in this paper requires some a priori information about the studied distributions. Consequently, methods based on this methodology require regional adaptation.
Thus, application of the proposed method yields distributions that cannot be obtained in another way, at least during an active conflict. The calculated data can be regarded as unique, reliable, and meaningful as evidence, rather than as fragmentary information from unknown and unconfirmed sources.
The use of children as soldiers and military personnel is a widespread practice of criminal and terrorist regimes worldwide. It is clearly condemned by the international community and recognized as a war crime. According to international law (in particular, Article 77.2 of the Protocol Additional to the Geneva Conventions of 12 August 1949, and relating to the Protection of Victims of International Armed Conflicts (Protocol I), 1977, and the Optional Protocol to the Convention on the Rights of the Child on the involvement of children in armed conflict, 2002), the participation of children under 15 years in armed forces is prohibited, as is the recruitment of persons under 18 years into paramilitary and guerrilla non-governmental groups with any goals. Indeed, the use of child soldiers is a separate, independent indicator of terrorism. According to international statistics, more than 300,000 children and teenagers have been used in the last 10 years as soldiers, partisans, or military personnel supporting armed forces around the world. The best-known historical cases of the mass use of minors in armed conflict are the children's units of the Liberation Tigers of Tamil Eelam in Sri Lanka, the young militants of Hamas and "Islamic Jihad" in the Palestinian Authority, the "Lord's Resistance Army" in Uganda, left-wing guerrilla groups in Colombia, and others.
Unfortunately, in the conflict in Donbas the practice of using children as soldiers and military personnel is also widely employed by Russia-supported illegal armed groups composed of Russian invaders and local collaborators. The mass, systematic use of persons under 18 as soldiers and military personnel during the conflict proves that we are dealing with a classic terrorist organization, controlled, managed, and supported by Russia.
The authors are deeply grateful to the anonymous referees for constructive suggestions that resulted in important improvements to the paper, and to colleagues from the American Statistical Association (ASA) for their critical and constructive comments and suggestions.
Caverlee, J., Cheng, Z., Sui, D. Z., & Kamath, K. Y. (2013). Towards Geo-Social Intelligence: Mining, Analyzing, and Leveraging Geospatial Footprints in Social Media. IEEE Data Eng. Bull., 36(3), 33-41.
Duda, R. O., Hart, P. E., & Stork, D. G. (2012). Pattern classification. John Wiley & Sons
Ermoliev, Y., Makowski, M. and Marti, K. (2012) Managing Safety of Heterogeneous Systems. Springer, Heidelberg. ISBN 978-3-642-22883-4
Geman, S., & Geman, D. (1993). Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. Journal of Applied Statistics, 20(5-6), 25-62.
Ho, Y. C., & Kashyap, R. L. (1965). An algorithm for linear inequalities and its applications. IEEE Transactions on Electronic Computers, 5(EC-14), 683-688.
Kostyuchenko Yu.V. (2015) “Geostatistics and remote sensing for extremes forecasting and disaster risk multiscale analysis” in: S. Kadry and A. El Hami (eds.), Numerical Methods for Reliability and Safety Assessment: Multiscale and Multiphysics Systems, Springer International Publishing Switzerland, 2015, XII, 805 p. 328 il., 404-423, DOI 10.1007/978-3-319-07167-1_16, ISBN 978-3-319-07166-4
Kostyuchenko Yu.V. (2016) Risk Perception Based Approach to Analysis of Social Vulnerability in: Risk Perception: Theories and Approaches. eds by Theodore Spencer, NY, Nova Publ., pp. 17-51, ISBN: 978-1-63484-623-3
Kostyuchenko Yu.V., Movchan D. (2015) Quantitative parameter of risk perception: can we measure a geoethic and socio-economic component in disaster vulnerability? in: Peppoloni, S. & Di Capua, G. (eds) Geoethics: the Role and Responsibility of Geoscientists. Geological Society, London, Special Publications, 419. – 2015, First published online Feb 16, 2015, http://dx.doi.org/10.1144/SP419.10
Kostyuchenko Yu.V., Movchan D., Kopachevsky I., Bilous Yu. (2015) Robust Algorithm of Multi-Source Data Analysis for Evaluation of Social Vulnerability in Risk Assessment Tasks // Proc. of SAI IntelSys 2015, Nov 10-11, IEEEXplore, 2015, pp. 944- 949, London, UK, doi 978-1-4673-7606-8/15
Krause, J., Croft, D. P., & James, R. (2007). Social network theory in the behavioural sciences: potential applications. Behavioral Ecology and Sociobiology, 62(1), 15-27.
Lerbinger, O. (2012). The crisis manager. Routledge.
Lerman, K. (2013). Social Informatics: Using Big Data to Understand Social Behavior. In Handbook of Human Computation (pp. 751-759). Springer New York.
Scott, J. (2012). Social network analysis. Sage.
Sultana, M., Paul, P. P., & Gavrilova, M. (2015). Social Behavioral Biometrics: An Emerging Trend. International Journal of Pattern Recognition and Artificial Intelligence, 29(08), 1556013.
The 2016 Trafficking in Persons Report (2016) U.S. Department of State Publications, Office of the Under Secretary of Civilian Security, Democracy and Human Rights, June 2016, 422p.
Wasserman, S., & Galaskiewicz, J. (Eds.). (1994). Advances in social network analysis: Research in the social and behavioral sciences (Vol. 171). Sage Publications.