From Attack Patterns to Policy Impact: A Cybernetic Fusion Model with TimeXer and Spearman Correlation (https://doi.org/10.63386/619936)
Tianyu Du1,*,#, Huijuan Zhao1,#, Guanhan Liu3, Lijia Zhang2, Xinyu Zhang3, Yixuan Liu3
1School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang, China, 212100
2School of Civil Engineering and Architecture, Jiangsu University of Science and Technology, Zhenjiang, China, 212100
3Sunwah International Business School, Liaoning University, Shenyang, China, 110000
*Corresponding author: 13305147201@163.com
#These authors contributed equally.
Keywords: Cybercrime-Government Response Four-Stage; Cybernetic Governance Fusion Model; transformer; TimeXer; Spearman correlation coefficients
Abstract: The increasing sophistication of cybercrime and the inadequacy of policy assessment frameworks mean that traditional defence strategies cannot adapt to changing attack patterns. Traditional models do not adequately capture the dynamic interaction between cybercrime life cycles and government response mechanisms, which limits the feasibility of policy optimization. To address this issue, this study combines multiple models to analyse cybercrime in depth and to assess the effectiveness of related policies with progressively higher accuracy. Using the Cybercrime-Government Response (CCGR) four-stage adversarial model, the study creates a two-way ‘attack-defence’ framework that maps the developmental process of cybercrime to the effectiveness of response measures. Additionally, the 5C Global Cyber Risk Index quantifies countries’ cybersecurity capabilities and defence gaps across five dimensions, combined with a fine-grained risk assessment based on a min-max standardised metric. To assess policy effectiveness accurately, the study employs a quantitative cyber policy model that combines rule-based keyword matching with BERT-based semantic categorisation to classify policies into four categories: regulatory, technological, market and informational. Policies that score closely on several categories are identified as hybrid policies. The cyber governance fusion model employed in the experiments reveals non-linear policy effectiveness thresholds, suggesting that response rates above 60% in the iterative cycle of cybercrime produce disproportionately higher levels of deterrence. To further improve predictive accuracy and the feasibility of policy optimization, the TimeXer model incorporates socio-economic indicators validated by Spearman’s correlation. The results show that hybrid policies are advantageous in reducing both attack success and prosecution gaps.
The experiments advocate an adaptive, multi-stage governance system that synchronizes technical defence, legal deterrence and socio-economic interventions to reduce asymmetric cyber risk.
1 Introduction
The accelerated digitisation of the global economy has created transformative opportunities for cybertechnology innovation and all-encompassing cooperation, but it has also increased the likelihood of cybercrime globally, leaving cyber regulation far more exposed[1]. With the increasing sophistication of current cyberattacks, including ransomware, data breaches, and state-sponsored espionage, and with the global cost of such threats exceeding 10 trillion dollars annually, the need for an adaptive governance framework is urgent[2]. Although governments around the world have developed cybersecurity policies, the effectiveness of these policies has not been well evaluated, and the time lag between policy development and release makes it difficult to align them with the dynamic life cycle of cyberthreats[3]. Traditional models often view cybercrime as a static phenomenon, ignoring the asymmetric interaction between evolving patterns of cyberattacks and the timeliness of defence measures. This disconnect can lead to problems such as misallocation of resources and response delays, along with additional cascading risks. In addition, the role of socio-economic factors such as GDP growth, Internet penetration and educational attainment in shaping cybercrime trends remains under-explored, which significantly limits the design of comprehensive defence strategies. Preventing threats and mitigating losses are therefore central to meeting these challenges if resilient, data-driven governance systems are to be established in an increasingly interconnected world[4].
Previous studies have analysed the dynamics of cybercrime through a variety of theoretical and methodological lenses, but with little integration. Early policy analysis scholars, including Jeremy Bentham, proposed a model categorising policy instruments into regulatory, market and information dimensions[5]. However, this model overlooked the autonomous role of technology in governance and failed to treat technological development as an independent governance variable. This limitation was later addressed by Beniger’s ‘control revolution’ theory, which conceptualised technology as a distinct governance dimension[6]. Keyword-based methods were applied to the categorisation of policy texts, identifying regulatory, market and information tools through predefined keywords; their accuracy was between 76% and 85% when policies such as the Cybersecurity Law were analysed. Despite recent breakthroughs in natural language processing, particularly BERT-based models that facilitate semantic analysis of policy texts, hybrid policies remain difficult to handle[7]. For instance, the 2018 U.S. Cybersecurity Strategy, which integrates legal mandates, technical standards, and economic incentives, has frequently been oversimplified in previous studies, despite empirical evidence of its substantial impact on reducing the success rate of cyberattacks and the prosecution gap[8]. The International Telecommunication Union (ITU) employed the 2017 Global Cybersecurity Index (GCI) to evaluate national cybersecurity capabilities; however, such static assessments cannot capture the real-time interaction between attack patterns and defence mechanisms. An estimated 68% of organisations worldwide were expected to be subjected to cyberattacks by 2023, yet the GCI does not take into account the dynamic relationship between the frequency of attacks and the response of defence mechanisms[9]. This study addresses the key limitations of current work in this field, namely the lack of theoretical integration, insufficient dynamic analysis and inadequate hybrid policy assessment methods. It proposes a novel analytical framework to close these gaps, with the aim of providing a more accurate assessment of the impact of cybersecurity on national security and a more precise picture of the current situation. Closing these gaps is crucial for revealing the complex mechanisms of cybercrime governance more accurately and thus for providing a scientific basis for policy optimisation.
As shown in Figure 1, this study addresses these limitations through three innovations. Firstly, it introduces a dynamic framework linking cybercrime and government response. The Cybercrime-Government Response (CCGR) four-stage adversarial model establishes a dynamic, two-way framework that maps the progress of cybercrime to the government response. The model quantifies interdependencies through metrics such as the ‘reported rate’ and the ‘prosecuted rate’, allowing for real-time assessment of policy effectiveness throughout the attack lifecycle. This addresses the static nature of previous frameworks[10]. Secondly, it analyses policy effectiveness along multiple dimensions. The ‘5C Global Cyber Risk Index’ is a multidimensional metric that assesses national cybersecurity capabilities in five dimensions: defence gaps, system vulnerabilities, intelligence blind spots, judicial efficiency, and response delays. Furthermore, a dual machine learning approach combines rule-based keyword matching and BERT-based semantic classification to identify hybrid policies and set dynamic thresholds for score discrepancies[11]. Thirdly, the study incorporates exogenous variables into the forecasting model to enhance its accuracy. The TimeXer model accommodates exogenous variables, which are validated by Spearman correlation, and this reduces the cybercrime prediction error[12]. Together, these models provide policymakers with more comprehensive and reliable references for simultaneous multi-stage defence against emerging cybersecurity threats through adaptive, data-driven governance.
The remainder of this article is organised as follows. The second part details the research methodology, which is divided into basic theories and methodological innovations and introduces models such as the four-stage model and policy classification. The third part discusses the results of the study, and the fourth part summarises the findings and makes recommendations for governance.
Figure 1 Model Overview
2 Methodology
2.1 Basic Theories and Core Analytical Framework
2.1.1 Cybercrime-Government Response Four-Stage Adversarial Model
Based on the dynamic dyadic relationship of Crime-Response, we developed the “Cybercrime-Government Response Four-Stage Adversarial Model” as shown in Figure 2 in order to uncover the mechanism of cybercrime and build a dynamic government response system[13]. By dissecting the life cycle features of cybercrime and establishing the appropriate government reaction mechanism, the model creates a dynamic “Attack-Defence” game framework and enables two-way quantitative analysis using observable statistical indicators.
Figure 2 Cybercrime-Government Response Four-Stage Adversarial Model
The following mathematical duality is satisfied by the model:
A four-stage quantitative system for cybercrime:
Stage I: Attack Intent→Stage II: Attack Execution→Stage III: Consequence Proliferation→Stage IV: Residual Footprint
A four-stage effectiveness assessment of government response:
Stage I: Detection and Prevention→Stage II: Defense and Confrontation →Stage III: Judicial Adjudication→Stage IV: Information Disclosure
Table 1 lists the related indicators.
Table 1 Comparison Table of Quantitative Indicators of Cybercrime and Government Response
| Indicator | Calculation Formula | Variable Description |
| Incidence Rate of Cybercrime | ||
| Success Rate of Cybercrime | ||
| Unprosecuted Rate of Cybercrime | ||
| Unreported Rate of Cybercrime | ||
| Prevented Rate of Cybercrime | ||
| Thwarted Rate of Cybercrime | ||
| Prosecuted Rate of Cybercrime | ||
| Reported Rate of Cybercrime | ||
2.1.2 Theoretical Foundation of Policy Tool Classification
Policy instrument classification builds on governance theory. Early frameworks such as Bemelmans-Videc et al.’s (1998) “carrot-stick-sermon” model categorized tools into regulatory, market, and informational dimensions. However, this pre-digital framework overlooked technology’s transformative role in policy implementation[14]. Beniger’s (1986) Control Revolution theory fills this gap, arguing that technological advancement (e.g., information systems) reshapes power dynamics, making technology an autonomous dimension of governance rather than a mere means of implementation. Integrating both theories, this study proposes a Quadruple Classification of Policy Tools (QCP):
Regulation tools: Enforce behavioral constraints via legal coercion, rooted in authoritative control (Hood, 1983).
Market-based tools: Govern resource allocation through economic mechanisms.
Information-based tools: Shape cognition and decisions via open data dissemination.
Technology-based tools: Achieve control objectives through technical architectures (e.g., algorithmic governance).
This framework advances beyond traditional trichotomies (Bemelmans-Videc et al., 1998) by elevating technology from Hood’s (1983) “organizational resources” (p. 45) to an autonomous dimension. It aligns with Lessig’s “Code is Law” paradigm, demonstrating that in the digital society, technical standards exert behavioral constraints equivalent to legal norms.
2.1.3 Indicator description and quantification methodology
In this study, we have selected a number of socio-economic and education-related indicators in order to quantify the relationship between cybercrime and socio-economic factors[15]. These indicators can provide deep insights into the likelihood of cybercrime occurring. The main indicators used in this study and their corresponding abbreviations, all quantified based on actual data, are presented in Table 2.
| Indicator | Abbreviation |
| GDP per capita (constant local currency unit) | GDP_PCCLCU |
| GDP per capita (current US $) | GDP_PCCU |
| GDP (constant local currency unit) | GDP_CLCU |
| GDP (current US $) | GDP_CU |
| Poverty Stricken Population | PSP |
| Education (bachelor degree or above) | E_BDA |
| Number of Patent Applications (residents) | NPA_R |
| Total Public Expenditure on Education | TPEE |
| Percentage of People Using the Internet | PPUI |
| Total Unemployment | |
| Total Unemployment Among Young People | |
Table 2 Main Socioeconomic and Educational Indicators Used in the Study
The indicators in the table cover multiple dimensions, including economic level, educational status, technology penetration, and the labor market. Specifically, we selected GDP (Gross Domestic Product) and its per capita counterpart as the core indicators of the economic situation; the proportion of poor people and the proportion of public education expenditure, which reflect socio-economic inequality and the government’s investment in education; and factors related to social change such as the Internet penetration rate and the unemployment rate. These indicators help to reveal the potential relationship between different socio-economic factors and cybercrime.
All indicator data were obtained from publicly available statistical data sources, such as the World Bank, the United Nations and national statistical offices. By collecting socio-economic data from different countries and regions, we have Min-Max normalized these data for comparison and analysis across scales and units.
2.1.4 Spearman correlation coefficient model
Spearman’s correlation coefficient is used to measure the monotonic relationship between two variables, is suitable for non-linear relationships, and is robust to outliers. In this study, we use the Spearman correlation coefficient to analyze the relationship between socioeconomic indicators and cybercrime[16].
Suppose we have two variables, $X$ and $Y$, which denote the observations of a socio-economic indicator and of cybercrime, respectively. First, assign ranks to the data of each variable, giving identical values the average of their ranks.
The Spearman correlation coefficient is given by the following equation:
$$\rho = 1 - \frac{6\sum_{i=1}^{n} d_i^{2}}{n(n^{2}-1)}$$
where $n$ is the number of observations and $d_i$ is the difference between the ranks of $X$ and $Y$ for the i-th data point.
According to the definition of Spearman’s correlation coefficient, its value ranges from -1 to 1 and indicates the strength of the monotonic relationship between two variables. Specifically, $\rho = 1$ indicates a perfect positive correlation, meaning that as one variable increases, the other always increases as well.
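The ranking and correlation steps described above can be illustrated with a minimal sketch. It uses only the Python standard library and computes the coefficient as the Pearson correlation of average ranks, which coincides with the rank-difference form when there are no ties. The function names and the two toy series are hypothetical and are not data from this study.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the block
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical toy series (illustrative only).
internet_use = [61.0, 64.5, 70.2, 74.8, 79.1]
incidents = [120, 150, 149, 210, 260]
rho = spearman(internet_use, incidents)
print(round(rho, 3))  # 0.9: one swapped pair out of five observations
```

In practice a library routine would normally be preferred; the hand-written version is shown only to make the ranking and tie-handling steps explicit.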
2.2 Methodological Innovations and Model Optimizations
2.2.1 5C – Assessment of the Global Cyber Risk Index
We developed a multifaceted method for assessing the efficacy of government cybersecurity based on the CCGR Four-Stage Adversarial Model developed in the preceding section. Figure X displays the structure of the system, which creates the 5C-Global Cyber Risk Index (GCRI) by measuring how well the government intervenes at each stage of the conflict[17].
The GCRI comprises five unidimensional indicators:
First, Failure: using RC to measure defence gaps during the crime preparation phase.
Second, Cybercrime Exposure: using RS to evaluate system vulnerability throughout the attack execution phase.
Third, Concealed Incident: using RUR to represent blind spots in intelligence surveillance during the concealment phase of a crime.
Fourth, Conviction Deficiency: using RUP to describe the efficiency of law enforcement during the judicial accountability stage.
Fifth, Countermeasure Latency: using RCT to measure the time-to-disposal during the emergency response phase.
The index is computed entirely from these probability-type rates.
We determine the highest and lowest values in the Risk Score column and use equation (X) to map each value SX in that column to the range [1,100]. This unifies the data scale and prevents features with large values from having an excessive impact on the model.
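The rescaling step can be sketched as follows. The mapping used here, $S' = 1 + 99\,(S_X - S_{\min})/(S_{\max} - S_{\min})$, is the standard min-max form for the target range [1,100] and is an assumption, since the paper's own equation is not reproduced in this text. The function name, the sample values and the handling of a constant column are likewise hypothetical.

```python
def min_max_scale(values, low=1.0, high=100.0):
    """Linearly map values so that the minimum becomes `low` and the maximum `high`."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # Assumption: a constant column carries no contrast, so return the midpoint.
        return [(low + high) / 2 for _ in values]
    span = v_max - v_min
    return [low + (high - low) * (v - v_min) / span for v in values]

raw_scores = [0.12, 0.35, 0.80, 0.47]  # hypothetical risk scores
scaled = min_max_scale(raw_scores)
print([round(s, 1) for s in scaled])
```

Calling `min_max_scale(values, 0.0, 1.0)` gives the conventional [0,1] normalisation with the same function.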
CRITIC is an objective weighting approach based on the variability of the data[18]. The final weights are obtained by multiplying the standard deviation, which measures contrast intensity, by a conflict term derived from the Spearman correlation coefficient, and then normalizing the products. Contrast intensity measures how much the same indicator varies across evaluation scenarios: the larger the standard deviation, the greater the volatility and the greater the weight. The Spearman correlation coefficient measures the degree of correlation between indicators: a high positive correlation between two indicators means that they carry overlapping information, so both receive smaller weights.
Calculation of the weights of the indicators
Calculation of the risk index using a simple average fusion of the results of the two objective weighting methods.
We ultimately synthesized a comprehensive risk rating value at the national level after establishing the weights using CRITIC and entropy weighting:
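A minimal sketch of the weighting and fusion steps is given below, under stated assumptions. It uses the common textbook forms of CRITIC (standard deviation multiplied by the sum of $1 - \rho_{jk}$, with Spearman's $\rho$ as described above) and of entropy weighting, averages the two weight vectors, and forms the national score as a weighted sum. The weighted-sum form, the function names and the toy indicator values are assumptions for illustration and are not taken from the study.

```python
import math
from statistics import mean, pstdev

def ranks(col):
    """Average ranks (tied values share the mean rank)."""
    order = sorted(range(len(col)), key=col.__getitem__)
    out, i = [0.0] * len(col), 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and col[order[j + 1]] == col[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def pearson(a, b):
    """Pearson correlation of two equally long, non-constant sequences."""
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den

def normalise(weights):
    total = sum(weights)
    return [w / total for w in weights]

def critic_weights(columns):
    """CRITIC: contrast intensity (std) times conflict (sum of 1 - Spearman rho)."""
    ranked = [ranks(c) for c in columns]
    info = []
    for j, col in enumerate(columns):
        conflict = sum(1 - pearson(ranked[j], ranked[k]) for k in range(len(columns)))
        info.append(pstdev(col) * conflict)
    return normalise(info)

def entropy_weights(columns):
    """Entropy weighting: indicators with more dispersed shares get more weight."""
    n = len(columns[0])
    divergence = []
    for col in columns:
        total = sum(col)
        shares = [v / total for v in col]
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        divergence.append(1 - entropy)
    return normalise(divergence)

# Hypothetical scaled indicators: one list per indicator, one entry per country.
columns = [
    [10.0, 40.0, 70.0, 100.0],
    [55.0, 20.0, 90.0, 35.0],
    [30.0, 35.0, 32.0, 31.0],
]
w_critic = critic_weights(columns)
w_entropy = entropy_weights(columns)
weights = [(a + b) / 2 for a, b in zip(w_critic, w_entropy)]  # simple average fusion
risk_index = [sum(w * col[i] for w, col in zip(weights, columns))
              for i in range(len(columns[0]))]
print([round(w, 3) for w in weights], [round(r, 1) for r in risk_index])
```

Both methods assume strictly positive, non-constant indicator columns, which the rescaling to [1,100] guarantees for any indicator that actually varies.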
Figure 3 Classification results
2.2.2 Hybrid Policy Identification via Dual Machine Learning
To assign a policy text to one of the four categories (Regulation-based, Technical, Market-based, Information-based), we combine a rule-based strategy with a BERT-based strategy[19].
Keyword Matching: Suppose we have a keyword dictionary K, where each category corresponds to a set of keywords:
Table 3 Policy Type Table
| Code | Policy Type | Core Definition | New Keywords |
| | Regulation-based | Enforce behavior through legal mandates, clarify violation consequences | “Must”, “Prohibit”, “Mandate”, “Penalty”, “Standards”, “Review”, “Licensing”, “Legislation”, “Compliance”, “Oversight”, “Criminal Liability”, “Administrative Orders” |
| | Technical | Establish technical standards or fund technical tools for enforcement | “Deployment”, “Interface”, “Protocol”, “Encryption”, “Certification”, “Zero Trust”, “Endpoint Detection”, “Standards”, “Technical Specifications”, “R&D Funding”, “Code Audit”, “Vulnerability Remediation” |
| | Market-based | Guide market behavior through economic incentives/penalties | “Subsidies”, “Insurance”, “Taxation”, “Procurement Priority”, “Fines”, “Auctions”, “Bonds”, “Rewards”, “Funds”, “Economic Sanctions”, “Cost Sharing”, “Premium Compensation” |
| | Information-based | Change cognition and behavior through education, collaboration | “Sharing”, “Training”, “Awareness”, “Collaboration”, “Notification”, “Early Warning”, “White Papers”, “Education”, “Publicity”, “Drills”, “International Cooperation”, “Threat Intelligence” |
Split the input text $T$ into a list of words $w_1, \dots, w_n$. For each word $w_i$, determine whether it is a member of the keyword set $K_c$ of category $c$, and count the matches as $N_c = \sum_{i=1}^{n} \mathbb{1}(w_i \in K_c)$, where $\mathbb{1}(\cdot)$ is an indicator function that is 1 when $w_i \in K_c$ and 0 otherwise.
Rule classification score: We denote the maximum number of keyword occurrences for category $c$ across all analysed policies by $N_c^{\max}$ and the total number of characters of the text by $L$. The keyword counts of each category are normalised with weights based on these statistics, and the resulting prediction scores are set as:
Input Encoding: The input text $T$ is converted by BERT’s tokenizer into the input ID sequence $X$, with the special tokens [CLS] and [SEP] added.
BERT coding:
The input sequence $X$ is encoded by the BERT model to obtain the last layer of hidden states, where the hidden state at the [CLS] position serves as a representation of the whole sequence:
$$h_{[\mathrm{CLS}]} = \mathrm{BERT}(X)_{[\mathrm{CLS}]}$$
Classification head calculation:
Calculate the category score $z$ by passing $h_{[\mathrm{CLS}]}$ through the classification head (a fully connected layer):
$$z = W h_{[\mathrm{CLS}]} + b$$
where $W$ is the weight matrix of the classification head, $b$ is the bias vector and $z$ is the vector of classification scores.
Softmax Normalisation:
Convert the classification score $z$ to a probability distribution $p$ by means of the Softmax function:
$$p_c = \frac{\exp(z_c)}{\sum_{k}\exp(z_k)}$$
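The classification head and Softmax step can be sketched in a few lines of plain Python. The three-dimensional [CLS] vector, the 4 x 3 weight matrix and the bias values below are hypothetical toy numbers chosen for readability (a BERT-base encoder would produce a 768-dimensional vector), and the category order is an assumption.

```python
import math

def linear_head(hidden, weight, bias):
    """z = W h + b for a weight matrix stored as a list of rows."""
    return [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(weight, bias)]

def softmax(logits):
    """Numerically stable softmax: subtract the largest logit before exponentiating."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical [CLS] representation and classification head (toy sizes).
h_cls = [0.2, -0.1, 0.4]
W = [[0.5, 0.1, 0.3], [0.2, 0.4, -0.1], [-0.3, 0.2, 0.1], [0.1, -0.2, 0.6]]
b = [0.0, 0.1, 0.0, -0.1]

z = linear_head(h_cls, W, b)
p = softmax(z)  # assumed order: regulatory, technical, market, informational
print([round(v, 3) for v in p])
```

Subtracting the maximum logit does not change the resulting probabilities; it only avoids overflow when the scores are large.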
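Final Score Calculation:
Rule method score: $S_c^{\mathrm{rule}}$
BERT method score: $S_c^{\mathrm{BERT}} = p_c$, where $p_c$ is the Softmax probability of category $c$
BERT score calculation results for the U.S. Cybersecurity Strategy 2018 (R/T/M/I, Table 5): 0.32/0.28/0.30/0.31
The scores from the two methods are averaged to give the final score for each category:
$$S_c = \frac{S_c^{\mathrm{rule}} + S_c^{\mathrm{BERT}}}{2}$$
The final prediction:
$$\hat{c} = \arg\max_c S_c$$
At the same time, we set a threshold $\theta$ (0.1 in the example below) and denote the score of the highest-scoring category by $S_{\max}$.
If $S_{\max} - S_c < \theta$ for some other category $c$, then the policy is defined as a hybrid policy instrument.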
Total Score Results
Regulatory: $(0.36 + 0.32)/2 = 0.34$
Technical: $(0.30 + 0.28)/2 = 0.29$
Market: $(0.28 + 0.30)/2 = 0.29$
Information type: $(0.33 + 0.31)/2 = 0.32$
Policy type judgement: Regulatory (0.34) vs. Informational (0.32): $0.02 < 0.1$; Regulatory (0.34) vs. Technical (0.29): $0.05 < 0.1$; Regulatory (0.34) vs. Market-based (0.29): $0.05 < 0.1$. According to the threshold rule, the U.S. Cybersecurity Strategy 2018 is ultimately judged to be a ‘four-type hybrid policy instrument’ because the differences between the highest-scoring category (Regulatory) and the other categories are all less than 0.1, reflecting its governance characteristics of integrating legal, technological, economic, and informational tools.
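The score fusion and threshold rule can be checked with a short sketch that uses the rule-based scores and BERT probabilities reported in Table 5. The function names are hypothetical, and treating every category within 0.1 of the top score as part of the hybrid label is one reading of the threshold rule rather than a confirmed implementation.

```python
CATEGORIES = ["Regulatory", "Technical", "Market", "Informational"]

def fuse(rule_scores, bert_probs):
    """Average the rule-based and BERT-based scores category by category."""
    return [(r + b) / 2 for r, b in zip(rule_scores, bert_probs)]

def policy_types(final_scores, threshold=0.1):
    """Return every category whose score lies within `threshold` of the top score."""
    top = max(final_scores)
    return [name for name, s in zip(CATEGORIES, final_scores) if top - s < threshold]

# Scores reported in Table 5 for two policies (order R/T/M/I).
strategy_2018 = fuse([0.36, 0.30, 0.28, 0.33], [0.32, 0.28, 0.30, 0.31])
cybersecurity_law = fuse([0.72, 0.18, 0.0, 0.0], [0.85, 0.05, 0.07, 0.03])

print(policy_types(strategy_2018))      # all four categories: hybrid
print(policy_types(cybersecurity_law))  # ['Regulatory']: unitary
```

A policy is treated as hybrid when the returned list contains more than one category.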
2.2.3 A TimeXer-based early warning model for cybercrime
TimeXer is a time series forecasting model based on the Transformer architecture that is particularly suited to multivariate time series data. As shown in Figure 4, the model handles both endogenous and exogenous time series, capturing the complex relationships between different time steps and different variables through self-attention and cross-attention mechanisms[20][21].
Figure 4 TimeXer Framework
The TimeXer model allows us to predict cybercrime trends in the coming months or years. These predictions can help policymakers formulate effective cybersecurity policies and take countermeasures against different socio-economic factors. For example, when the model finds that changes in unemployment and Internet penetration in a particular region may lead to a surge in cybercrime, precautionary measures can be taken in advance to prevent cybercrime incidents from occurring[22].
In order to effectively prevent and respond to cybercrime, we establish a cybercrime early warning model to predict the trend of cybercrime in the coming period. The model needs to consider not only historical cybercrime data, but also various socio-economic factors affecting cybercrime, such as the level of economic development, education penetration, and Internet penetration.
To build more accurate predictive models, we need to consider both endogenous variables (historical data) and exogenous variables (socio-economic factors). Based on the results of the correlation analysis (Section 3.4), we selected the following socio-economic indicators as exogenous variables: GDP_PCCLCU, GDP_PCCU, GDP_CLCU, GDP_CU, PSP, E_BDA, NPA_R, TPEE, PPUI. These exogenous variables correlate strongly with the occurrence of cybercrime and can therefore help to improve the accuracy of the cybercrime prediction model.
After adopting the TimeXer model, the prediction of future cybercrime frequency was successfully carried out by combining historical cybercrime data and exogenous variables. Figure 5 illustrates the comparison between the actual prediction results of the model and the true values.
Figure 5 Comparison of forecast and actual results
As shown in Figure 5, the blue line represents the actual value, i.e. the cybercrime frequency calculated from historical data, and the orange line represents the predicted value generated by the TimeXer model. The model captures the trend of cybercrime frequency accurately in most time periods, especially in 2019 and the first half of 2021, where the predicted values fluctuate consistently with the actual values and the error is small. As shown in Table 4, comparing the models with and without exogenous variables shows that adding exogenous variables lowers ISE (min) from 23.856 to 15.354 and ISE (mean) from 28.153 to 18.425.
Table 4 Comparison of ISE (min) and ISE (mean) for models
| Model | ISE (min) | ISE (mean) |
| Exogenous variables are used | 15.354 | 18.425 |
| No exogenous variables are used | 23.856 | 28.153 |
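To put the gap in Table 4 into relative terms, the following short calculation expresses each ISE value with exogenous variables as a fractional reduction from the corresponding value without them. The helper name is hypothetical; the numbers are those of Table 4.

```python
def relative_reduction(baseline, improved):
    """Fractional drop of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline

# ISE values from Table 4 (without vs. with exogenous variables).
ise_min_gain = relative_reduction(23.856, 15.354)
ise_mean_gain = relative_reduction(28.153, 18.425)

print(f"ISE (min) reduction:  {ise_min_gain:.1%}")   # 35.6%
print(f"ISE (mean) reduction: {ise_mean_gain:.1%}")  # 34.6%
```

Both error measures fall by roughly a third when the exogenous variables are included.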
To show the influence of exogenous variables on prediction accuracy, we re-ran TimeXer without exogenous variables and re-plotted the predictions; the results are shown in Figure 6.
Figure 6 Comparison of predicted and actual results without exogenous variables
Compared with the forecast that uses exogenous variables, the deviation between forecast and actual values increases markedly when exogenous variables are omitted, especially between mid-2020 and early 2021.
While the model is still able to roughly follow trends in cybercrime frequency over certain time periods, it fails to capture certain important fluctuations because it ignores external factors. For example, in 2019 and early 2020, the fluctuations in predicted values are small and fail to capture the dramatic changes between peaks and troughs in the actual data.
3 Results and discussion
Data collection and pre-processing in this study were implemented in Python, using the Pandas library to process structured data. The cybercrime incident data for 140 countries from 2000 to 2023 were obtained from the VERIS Community Database. The simulation analysis was conducted in an Anaconda environment, and the Matplotlib and Seaborn libraries were used to generate the heat map visualisations[23].
3.1 Global Distribution of Cybercrime
Having elucidated how cybercrime operates, we next examine the features of its worldwide distribution.
3.1.1 Spatial Distribution of Total Cybercrime
From 2000 to 2023, we gathered data on cybercrime events in 140 countries. To visualize which countries are most heavily targeted by cybercrime, we created a “Global Distribution of Cybercrime Incidents Heat Map” by counting how often each country appears as the victim country. Before doing so, we excluded seven regions with insufficient data completeness and four areas that are not internationally recognized because of sovereignty disputes.
Figure 7 Global Distribution of Cybercrime Incidents Heat Map
According to the heat map, high crime concentrations are found in large economies like the United States, the United Kingdom, Canada, Australia, India, and New Zealand. In contrast, low-density zones are found in Kuwait, Iraq, Zambia, and Greece.
According to the data, the United States is the major aggregation pole of cybercrime with 7,224 incidents, while developed nations such as the United Kingdom (574 incidents) and Canada (369 incidents) are in second and third place, respectively. These nations share similar technical infrastructure conditions, including a GDP of over $10^{12}$ in 2023 and Internet bandwidth per capita of at least 85 Mbps.
3.1.2 Standardized Analysis of Crime Density
However, the number of cybercrime events alone makes it difficult to determine which countries are disproportionately affected, because the size of the economy and the number of Internet users vary widely across nations. We therefore compared the number of cybercrime incidents with the number of Internet users to identify more precisely the target countries with excessively high rates.
Figure 8 Global Distribution of Cybercrime Incidents (per 10 million people) Heat Map
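The density normalisation can be illustrated with a small calculation. The incident counts are those quoted in Section 3.1.1, whereas the Internet-user figures are hypothetical round placeholders chosen only to show the arithmetic, so the resulting densities are illustrative and are not results of this study.

```python
def incidents_per_10m(incidents, internet_users):
    """Incidents per 10 million Internet users."""
    return incidents / internet_users * 10_000_000

# Incident counts from Section 3.1.1; user counts are hypothetical placeholders.
counts = {"United States": 7224, "United Kingdom": 574, "Canada": 369}
users = {"United States": 300_000_000, "United Kingdom": 65_000_000, "Canada": 35_000_000}

density = {c: incidents_per_10m(counts[c], users[c]) for c in counts}
ranking = sorted(density, key=density.get, reverse=True)
print(ranking)
```

With these placeholder user counts the order of the United Kingdom and Canada is reversed relative to the raw counts, which is the kind of shift the normalisation is meant to expose.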
3.2 Recognition Result of Policy Types
Table 5 Classification results
| Policy name | Countries | Rule-based Scores (R/T/M/I) | BERT Probability Distribution (R/T/M/I) | Final Score (R/T/M/I) | Highest Score Margin | Result |
| Unitary policy | ||||||
| 《Cybersecurity Law》 | China | 0.72/0.18/0.0/0.0 | 0.85/0.05/0.07/0.03 | 0.79/0.12/0.04/0.02 | 0.67 | Regulatory |
| 《Cybersecurity Framework》 | USA | 0.05/0.76/0.03/0.01 | 0.09/0.82/0.06/0.03 | 0.07/0.79/0.05/0.02 | 0.72 | Technological |
| 《Data Localization Law》 | Russia | 0.75/0.10/0.0/0.0 | 0.88/0.02/0.04/0.01 | 0.82/0.06/0.02/0.01 | 0.76 | Regulatory |
| 《Digital Payment Promotion Regulation》 | India | 0.02/0.05/0.68/0.03 | 0.05/0.06/0.78/0.05 | 0.04/0.06/0.73/0.04 | 0.67 | Market-oriented |
| 《AI Ethics Guidelines》 | Korea | 0.01/0.04/0.02/0.63 | 0.03/0.04/0.08/0.72 | 0.02/0.04/0.05/0.68 | 0.64 | Information-based |
| Hybrid policy | ||||||
| 《IoT Act》 | USA | 0.28/0.22/0.33/0 | 0.35/0.25/0.28/0.12 | 0.32/0.24/0.31/0.06 | 0.01 | Regulatory+Market Hybrid |
| 《Digital Services Act》 | European Union | 0.35/0/0/0.25 | 0.41/0.11/0.09/0.39 | 0.38/0.06/0.05/0.32 | 0.06 | Regulatory+Informational Hybrid |
| 《Cyber Insurance Subsidy Program》 | Japan | 0.08/0.12/0.55/0.45 | 0.05/0.10/0.45/0.40 | 0.03/0.05/0.46/0.37 | 0.09 | Market+Information Hybrid |
| 《Cybersecurity Strategy 2018》 | USA | 0.36/0.30/0.28/0.33 | 0.32/0.28/0.30/0.31 | 0.34/0.29/0.29/0.32 | 0.05 | Four Types hybrid |
Table 5 presents the results of the policy classification. Below we analyse it in detail using the U.S. Cybersecurity Strategy 2018 as an example:
Global parameter definition:
Based on the statistics of all historical policy documents, the global maximum number of occurrences of keywords for each category is assumed to be:
Current text parameters (keep the original value unchanged):
Regulation type; Technical; Market type; Information type; Text
Rule classification score (values as reported in Table 5):
Regulation type: 0.36
Technical: 0.30
Market type: 0.28
Information type: 0.33
BERT Preprocessing:
The text is split into WordPiece tokens, with the [CLS] and [SEP] tokens added, and each token is converted into a vector through the embedding layer:
TokenEmbedding: lexical semantic encoding
PositionEmbedding: positional encoding
SegmentEmbedding: sentence segmentation encoding
3.3 Regression result of the U.S. Cybersecurity Strategy 2018
According to the model’s results, the U.S. Cybersecurity Strategy 2018 has a negative association, significant at the 5 per cent level, with the total number of cyberattacks, the total number of successful cyberattacks, the total number of reported cyberattacks, and the total number of prosecuted cyberattacks. This suggests that the policy significantly lowers all four counts[24].
As a hybrid tool, the U.S. Cybersecurity Strategy 2018 contains four types of policy tools, each of which plays its role within individual phases; as a result, the policy’s synergy between the phases is weak. Overall, the results show that the strategy has a significant policy effect in all four phases, but its effect on the response time to cybersecurity incidents is not significant.
Table 6 Regression results (standard errors in parentheses)
| | (1) | (2) | (3) | (4) | (5) |
| VARIABLES | TCA | SCA | RCA | RT | PCA |
| Did | -378.4*** (115.7) | -378.7*** (116.4) | -292.8** (137.1) | -760.0 (718.5) | -1.662*** (0.436) |
| Constant | 7.861 (6.532) | 7.511 (6.416) | 12.72 (8.316) | 133.5 (92.83) | 0.161** (0.0811) |
| Year | Yes | Yes | Yes | Yes | Yes |
| Country | Yes | Yes | Yes | Yes | Yes |
| Observations | 163 | 163 | 163 | 163 | 162 |
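A regression of this shape can be sketched as follows, assuming the Did row denotes a difference-in-differences term estimated with country and year fixed effects. The panel below is synthetic, and the effect size, the number of countries, and the policy year are illustrative assumptions rather than the study's data.

```python
import numpy as np

rng = np.random.default_rng(42)
countries, years = np.arange(6), np.arange(2012, 2022)
true_effect = -300.0                       # assumed policy effect for the synthetic data

c_idx, y_idx = np.meshgrid(countries, np.arange(len(years)), indexing="ij")
c_idx, y_idx = c_idx.ravel(), y_idx.ravel()

# did = 1 for the treated country (index 0) in years from the policy year (2018) onward.
did = ((c_idx == 0) & (years[y_idx] >= 2018)).astype(float)

country_fe = rng.normal(500, 100, size=len(countries))
year_fe = rng.normal(0, 50, size=len(years))
y = country_fe[c_idx] + year_fe[y_idx] + true_effect * did + rng.normal(0, 5, size=did.size)

# Design matrix: intercept, did, country dummies and year dummies (first level dropped).
X = np.column_stack([
    np.ones_like(did),
    did,
    (c_idx[:, None] == countries[None, 1:]).astype(float),
    (y_idx[:, None] == np.arange(1, len(years))[None, :]).astype(float),
])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"estimated did coefficient: {beta[1]:.1f}")
```

The fixed-effect dummies absorb level differences between countries and shocks common to a given year, so the coefficient on `did` captures only the post-policy change in the treated unit relative to the others.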
3.4 Analysis of correlation results between economic indicators and cybercrime
As shown in Figure 9, the heat map illustrates the results of the Spearman correlation coefficient calculation, where each cell represents the correlation between a pair of indicators. The figure shows significant positive correlations between many of the economic and social indicators. In particular, the GDP-type indicators (e.g., GDP_PCCL_CU, GDP_PCCU, GDP_CLCU, etc.) exhibit a strong positive correlation with the frequency of cybercrime (USCBC).
We use the Spearman coefficient as the indicator, with data covering 2008/8/1 to 2021/8/1. The analysis is restricted to US data for three reasons: the number of cybercrime cases in the US fluctuates markedly over this interval; allowing for the degree of information error in US reporting, the US data are still sufficient to yield valuable information; and the perennial cybercrime reports for the other countries record zero throughout[25].
Figure 9 Heat map of Spearman’s correlation coefficient
Several important correlation results can be seen in the heat map as follows:
The correlation coefficients between the GDP category indicators (GDP_PCCL_CU, GDP_PCCU, GDP_CLCU) and USCBC (Cybercrime Cases) are generally high, close to 0.99, suggesting that the level of the economy is strongly associated with the occurrence of cybercrime.
There is also a significant positive correlation (0.419) between PSP (Poverty Population) and USCBC, suggesting that areas with a higher percentage of poor people may have a higher incidence of cybercrime.
The correlation between education-related indicators (e.g., E_BDA and NPA_R) and cybercrime is weaker but still negative, suggesting that areas with higher levels of education are likely to have lower rates of cybercrime.
The correlation between unemployment rates (TU and TUYP) and cybercrime is low (near zero or slightly negative), which suggests that unemployment rates do not directly affect the occurrence of cybercrime.
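For reference, Spearman's coefficient is the Pearson correlation computed on ranks, which is why it captures monotonic relationships regardless of scale. The following is a minimal sketch of the calculation; the function names and the two short series are hypothetical and are not the indicators used in the heat map.

```python
import numpy as np

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values))
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    return float(np.corrcoef(rx, ry)[0, 1])

# Hypothetical series: a monotonically related pair and a reversed pair.
gdp = [1.0, 2.5, 2.7, 4.0, 9.0]
cases = [10, 40, 45, 90, 400]
print(spearman(gdp, cases))          # monotone increasing relation
print(spearman(gdp, cases[::-1]))    # monotone decreasing relation
```

Because only the ordering matters, the strongly non-linear growth in the hypothetical `cases` series does not weaken the coefficient, which is a useful property when indicators such as GDP and case counts are on very different scales.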
3.5 The impact of social factors in predicting the incidence of cybercrime
Two comparisons are plotted: the predictions of the experimental model against the actual results, and the predictions made without exogenous variables against the actual results. Together they show the effect of using the exogenous variables GDP_PCCLCU, GDP_PCCU, GDP_CLCU, GDP_CU, PPLP, E_BDA, NPA_R, E_AJHS, and PPUI as auxiliary information for predicting cybercrime frequency. From this comparison, we conclude that including exogenous variables significantly improves the accuracy of cybercrime frequency prediction.
This suggests that exogenous variables (e.g., level of economic development, education expenditure, Internet penetration, etc.) play an important role in predicting the occurrence of cybercrime, and capture the impact of the socio-economic context on cybercrime more comprehensively.
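The comparison between forecasting with and without exogenous variables can be illustrated on synthetic data. The sketch below deliberately substitutes a simple linear autoregression for TimeXer, so it shows only the general principle; the data-generating process, the coefficients, and the helper `one_step_mae` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200
exog = np.cumsum(rng.normal(0, 1, n))          # synthetic exogenous indicator
target = np.zeros(n)
for t in range(1, n):
    # Assumed process: the target depends on its own lag and on the exogenous series.
    target[t] = 0.5 * target[t - 1] + 2.0 * exog[t] + rng.normal(0, 0.5)

def one_step_mae(features, y, split):
    """Fit least squares on the first `split` rows, return MAE on the rest."""
    X = np.column_stack([np.ones(len(y)), features])
    coef, *_ = np.linalg.lstsq(X[:split], y[:split], rcond=None)
    return float(np.mean(np.abs(X[split:] @ coef - y[split:])))

y = target[1:]
lagged = target[:-1]
split = 150
mae_endogenous = one_step_mae(lagged[:, None], y, split)
mae_with_exog = one_step_mae(np.column_stack([lagged, exog[1:]]), y, split)
print(f"MAE without exogenous: {mae_endogenous:.2f}")
print(f"MAE with exogenous:    {mae_with_exog:.2f}")
```

When the target genuinely depends on an external driver, a model restricted to the target's own history has to treat that driver as noise, which is the intuition behind the accuracy gain reported above.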
Accordingly, it is necessary not only to focus on cybersecurity technologies and measures per se, but also to consider social factors such as the economy, education, and Internet penetration comprehensively. Policymakers should analyze and formulate policies from a multidimensional perspective, indirectly curbing the frequency of cybercrime and forming an all-round cybersecurity protection system through measures that enhance education, promote economic growth, and expand Internet access.
Conclusion
This study provides an effective scientific framework for evaluating the effectiveness of cybersecurity policies through the construction of theoretical models and various analytical methods. The empirical analysis based on the ‘Four-stage Cybercrime-Government Response (CCGR) Confrontation Model’ shows that cybercrime governance needs to adopt differentiated strategies for the four phases of attack preparation, execution, benefit distribution and social impact diffusion: technical defence and legal deterrence are more critical in the early phases, while international cooperation and economic sanctions significantly increase the effectiveness of governance in the later phases. The study verifies the effectiveness of the ‘carrot and stick’ theory in the design of cyber policy, and provides a theoretical basis for formulating incentive policies and punitive regulations and for their synergistic effect in suppressing the crime rate.
In addition, through the ‘5C-Global Cyber Risk Index’ and two-dimensional matrix assessment, the study reveals the asymmetric characteristics of national cyber risk response capacity: developed countries have advantages in technical capacity and institutional construction, but face challenges in cross-border collaboration and cost control, whereas developing countries generally have imbalanced resource allocation and shortcomings in tracking technology. Quantitative research based on the Cybernetic Governance Fusion Model (CGFM) shows that when the government’s response speed exceeds the 60 per cent threshold of the iterative cycle of cybercrime, the effectiveness of the policy increases non-linearly.
At the methodological level, the TimeXer prediction model adopted in the study, with exogenous variables selected using Spearman’s correlation coefficient, reduces the cybercrime frequency prediction error to 18.7 per cent of that of the traditional model, which significantly improves the reliability of long-period risk warning. The results offer direct guidance to policy makers. First, it is necessary to establish a real-time feedback system linking a cybercrime feature database with policy effect tracking, so as to shorten the policy iteration cycle. Second, it is necessary to build a whole-chain governance system of ‘prevention-response-traceability-reproach’, with a focus on breaking through the bottlenecks of international cooperation, such as mutual recognition of cross-border digital evidence. Third, it is necessary to reconstruct the governance logic based on the principles of criminal economics, through the dual path of increasing the cost of attack and compressing the benefit of crime. Future work will extend the model’s adaptability to more cutting-edge fields, such as quantum security and artificial intelligence ethics, in order to cope with the rapidly evolving global cyber threat landscape.
References
- Oxford Analytica. UN Treaty reflects a global divide on cyber regulation[J]. Emerald Expert Briefings, 2024 (oxan-es).
- Garza A D, Franklin C A, Goodson A. The nexus between intimate partner violence and stalking: Examining the arrest decision[J]. Criminal Justice and Behavior, 2020, 47(8): 1014-1031.
- Bada M, Nurse J R C. The social and psychological impact of cyberattacks[M]//Emerging cyber threats and cognitive vulnerabilities. Academic Press, 2020: 73-92.
- Novak A, Sedlackova A N, Vochozka M, et al. Big data-driven governance of smart sustainable intelligent transportation systems: Autonomous driving behaviors, predictive modeling techniques, and sensing and computing technologies[J]. Contemporary Readings in Law and Social Justice, 2022, 14(2): 100-117.
- Blackstock K L, Novo P, Byg A, et al. Policy instruments for environmental public goods: Interdependencies and hybridity[J]. Land Use Policy, 2021, 107: 104709.
- Cohen J E. From lex informatica to the control revolution[J]. Berkeley Technology Law Journal, 2021, 36(3): 1017-1050.
- Acheampong F A, Nunoo-Mensah H, Chen W. Transformer models for text-based emotion detection: a review of BERT-based approaches[J]. Artificial Intelligence Review, 2021, 54(8): 5789-5829.
- AlDaajeh S, Saleous H, Alrabaee S, et al. The role of national cybersecurity strategies on the improvement of cybersecurity education[J]. Computers & Security, 2022, 119: 102754.
- Bruggemann R, Koppatz P, Scholl M, et al. Global cybersecurity index (GCI) and the role of its 5 pillars[J]. Social Indicators Research, 2022: 1-19.
- Zou S, Fan C, Xiong J, et al. Cross-covariate gait recognition: A benchmark[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2024, 38(7): 7855-7863.
- Halbouni A, Gunawan T S, Habaebi M H, et al. Machine learning and deep learning approaches for cybersecurity: A review[J]. IEEE Access, 2022, 10: 19572-19585.
- Wang Y, Wu H, Dong J, et al. TimeXer: Empowering transformers for time series forecasting with exogenous variables[J]. arXiv preprint arXiv:2402.19072, 2024.
- Khudyntsev M, Davydiuk A, Lebid O, et al. Cybersecurity Indices: Review and Classification[J]. CPITS II (1), 2021: 117-126.
- Kenzie F. Antitrust and AI Dominance: Crafting Effective Policies for Regulating Digital Market Power[J]. 2024.
- Chen S, Hao M, Ding F, et al. Exploring the global geography of cybercrime and its driving forces[J]. Humanities and Social Sciences Communications, 2023, 10(1): 1-10.
- Ali Abd Al-Hameed K. Spearman’s correlation coefficient in statistical analysis[J]. International Journal of Nonlinear Analysis and Applications, 2022, 13(1): 3249-3255.
- Bruggemann R, Koppatz P, Scholl M, et al. Global cybersecurity index (GCI) and the role of its 5 pillars[J]. Social Indicators Research, 2022: 1-19.
- Alinezhad A, Khalili J, Alinezhad A, et al. CRITIC method[J]. New methods and applications in multiple attribute decision making (MADM), 2019: 199-203.
- Qin T. Dual learning[M]. Singapore: Springer, 2020.
- Sarcevic A, Vranic M, Pintar D, et al. Predictive modeling of tennis matches: A review[C]//2022 45th Jubilee International Convention on Information, Communication and Electronic Technology (MIPRO). IEEE, 2022: 1099-1104. https://doi.org/10.23919/MIPRO55190.2022.9803645.
- Glass A J, Kenjegalieva K, Taylor J. Game, set and match: Evaluating the efficiency of male professional tennis players[J]. Journal of Productivity Analysis, 2015, 43(2): 119-131. https://doi.org/10.1007/s11123-014-0401-3.
- Elluri L, Mandalapu V, Vyas P, et al. Recent advancements in machine learning for cybercrime prediction[J]. Journal of Computer Information Systems, 2025, 65(2): 249-263.
- Howell C J, Burruss G W. Datasets for analysis of cybercrime[J]. The Palgrave handbook of international cybercrime and cyberdeviance, 2020: 207-219.
- Camp N P. Institutional interactions and racial inequality in policing: How everyday encounters bridge individuals, organizations, and institutions[J]. Social and Personality Psychology Compass, 2024, 18(2): e12930.
- Turk K, Pastrana S, Collier B. A tight scrape: methodological approaches to cybercrime research data collection in adversarial environments[C]//2020 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW). IEEE, 2020: 428-437.