Welcome to General-purpose AI poses some significant risks. How can we mitigate them?.

In the past two articles (Potential risks from General-purpose AI systems: Part 1- Risks, Potential risks from General purpose AI systems- Part II: Systemic Risks), we discussed the risks of general-purpose AI. These risks are clear and significant, and it is possible that some people are already experiencing them. How about we look at some mitigation strategies? It is challenging to manage general-purpose AI — risk management in this context is identifying, assessing, mitigating, and monitoring the risks identified. Press enter or click to view image in full size

Why is risk management for general-purpose AI complex? 1. Broad use case for general-purpose AI.

The uses of general-purpose AI are broad. While you are using it to find a good recipe for a novel pancake, a medical student uses it to explain a diagnosis and a physicist uses it to break down a Newtonian theory. This wide range of use cases, including the generation of video or simulation products, increases the difficulty of comprehensively anticipating the relevant use cases, identifying the risk, or testing how the system will behave in a relevant real-world circumstance. It is difficult to determine what the user will use the general-purpose AI for and help address the associated risks. 2. There is little or no Model explainability.

Developers still understand little about how the general-purpose AI models operate. When it is challenging to tell how the model operates, predicting the behavioral issues or resolving the unknown matters when they are observed becomes challenging. The shift from traditional programming has shifted understanding of these models, making them elusive. General-purpose AI models are trained using large volumes of data, resulting in a challenge in scrutinizing the inner workings of the models. While this can be increased with model explanation and interpretability techniques, the research into the two concepts remains nascent. 3. AI agents

AI agents, which are General purposes AI systems that can autonomously act, plan, delegate, or achieve goals, present new significant challenges for risk management. AI agents use general software to search, schedule, and program to accomplish their tasks. As they continue to become useful for many different sectors and industries, they may also exacerbate several risks. One possible challenge is that users might not always know what their AI agents are doing. This potential to act and operate outside anyone’s control could make it easy for attackers to hijack agents and instruct them to do something else malicious. Additionally, there are chances with everyone using AI agents, for AI agents to interact with each other, creating complex new risks. Minimum approaches have been developed to manage risks associated with AI agents. 4. Evidence dilemma

One of the significant issues with addressing general-purpose AI risks is the pace of advancement in its uses and capabilities. The most evident one is how fast academic cheating using general-purpose AI has shifted from negligible to widespread. Before measures could be implemented to regulate general-purpose AI in academic cheating, almost everyone could access and use it for any of those tasks. As long as evidence for risk remains incomplete, decision-makers cannot know whether the risk will merge or has emerged; this is known as an evidence dilemma. This creates a tradeoff, though; implementing preemptive or early mitigation measures might prove unnecessary while waiting for conclusive evidence could leave people and society vulnerable to risks. This can be reduced by implementing early warning systems and risk management frameworks that lessen the dilemma. The EW and risk framework in two ways: triggering specific mitigation measures when there is new evidence of risks or requiring developers to provide evidence of safety before releasing a new model. 5. Information gap.

AI companies do not release or share information about their general-purpose AI systems, especially before release. This limits the information available to policymakers and risk managers in non-industry research and governments in implementing risk mitigation strategies. Companies cite reasons around commercial and safety concerns as a reason to limit information sharing; however, this information gap limits risk management from other actors.

This could also be looked into in relation to the competitive pressure on AI companies and governments, forcing them to focus on developing systems rather than prioritizing risk management. The competition forces the players to focus more on investment in other resources than risk management.

Nonetheless, we have techniques and frameworks for managing risks posed by general-purpose AI. Press enter or click to view image in full size

Companies and regulator groups can use the existing methods, techniques, and frameworks to identify and assess risks. We also have methods for mitigating and monitoring these risks. 1. Assessing general prose AI systems for risks

This approach relies on spot checks, i.e., testing the behavior of general-purpose AI in specific situations. This makes it severely limited. Remember what we said about the evidence dilemma? If we cannot conceptualize the risks, we cannot test for them by assessing them. The method can help surface potential hazards of the models before they are deployed. However, some of these hazards may be missed or overestimated/underestimated. Additionally, there is a difference between the real world and test conditions; users, if they choose to, can find new ways to misuse or expose themselves to unknown risks when using general-purpose AI. 2. Evaluators need substantial expertise, resources, and sufficient access to relevant information

If we want risk identification and assessment to be effective, the evaluators require substantial expertise and resources. They also need to have access to relevant information. Rigorous assessment of risks requires combining multiple evaluation approaches. The evaluators need to understand different facets of user needs before testing for edge cases and novel risks. Evaluators need time and more direct access to the models and the training data. This means the evaluators would best be placed within the company to access the data and information about the technical methodologies used in training the models. Since companies would not provide such information and data to outsiders, the evaluators and ethics boards should be in-house. 3. Training general-purpose AI to function more safely.

Despite the investment in training methods focusing on user safety, no current method can reliably prevent even overtly unsafe outputs. One method that focuses on exposing the models to conditions for them to misbehave or fail during training is adversarial training. The aim is to build resistance to these cases. Regardless, adversaries still find new ways to attach or circumvent the safeguards with low to moderate efforts. The data and feedback from users used to train the models are unreliably imperfect, which results in the models misleading the users on difficult questions and making errors. There is promise, however, in methods that focus on using AI to detect misleading behaviors. Press enter or click to view image in full size 4. Monitoring techniques

This entails identifying risk and evaluating a model's performance when it is in use. The process also involves different interventions to prevent harmful actions. This improves the safety of a general-purpose AI after it is deployed to users. Current monitoring strategies monitor system performance and identify potentially harmful inputs/outputs. However, skilled or moderately skilled users can circumvent the safeguards. This creates significant research areas around using hardware-enabled mechanisms to monitor and, more effectively, prevent these circumnavigations. 5. Safeguarding privacy

One of the biggest concerns when using general-purpose AI is how it treats private data. While people might try as much as possible not to share private information when interacting with general-purpose AI, the breadcrumbs shared can likely be used to create the user profile. Multiple methods help safeguard privacy across the AI lifecycle to prevent this risk. These methods include removing sensitive information from training data and controlling how much information is learned from training data, such as differential privacy and confidential computing. However, many privacy-enhancing methods from other research fields are not yet applicable to general-purpose AI systems due to the computational requirements of AI systems. Conclusion.

It is possible to mitigate some of the risks associated with general-purpose AI- frameworks, policies, and approaches exist. However, we still have a long way to go before all the conceivable dangers are mitigated or managed effectively. As a result, there is a gap in research to come up with Beter and more effective methods to address risks in general-purpose AI, especially when the growth rate is as fast as it is. In the following article, we will explore different frameworks that have been proposed to help with AI risk management. These frameworks include the NIST AI risk management framework and the EU AI Act.