OpenAI has announced that it has developed an AI model called 'CriticGPT' that detects errors in ChatGPT's output. Like ChatGPT, CriticGPT is based on GPT-4.
Finding GPT-4 errors with GPT-4 | OpenAI
https://openai.com/index/how-to-find-gpt4-errors-with-gpt-4/
Chat AI such as ChatGPT can generate code and long passages of text from just a few prompts. However, the code and text it produces often contain errors, and there have been reports of ChatGPT-generated code being used as is, leading to bugs and real-world damage.
A failure story where a bug in the code generated by ChatGPT was overlooked, resulting in a loss of more than 1.5 million yen – GIGAZINE
To address this, OpenAI developed CriticGPT, a model that detects errors in ChatGPT's output and suggests corrections. CriticGPT is built on GPT-4, and its ability to find and correct code errors was improved by training it on code containing manually inserted errors together with written feedback that points out and corrects those errors.
Below is an example of CriticGPT in use. CriticGPT points out that the startswith() method is not appropriate for the purpose and suggests an alternative to the code generated by ChatGPT.
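To make that kind of criticism concrete, here is a minimal Python sketch. It assumes the task is to check whether a file path stays inside a permitted directory; the directory name /safedir and both function names are hypothetical and chosen only for illustration. It is not the code from OpenAI's example, just one way a prefix check with startswith() can go wrong and one commonly used alternative, os.path.commonpath().

```python
import os

SAFE_DIR = "/safedir"  # hypothetical directory name, for illustration only


def is_inside_naive(path: str, safe_dir: str = SAFE_DIR) -> bool:
    """Prefix check: any path whose text begins with safe_dir is treated as inside it."""
    return os.path.abspath(path).startswith(os.path.abspath(safe_dir))


def is_inside_commonpath(path: str, safe_dir: str = SAFE_DIR) -> bool:
    """Component-wise check: resolve symlinks, then compare whole path components."""
    real_path = os.path.realpath(path)
    real_safe = os.path.realpath(safe_dir)
    return os.path.commonpath([real_path, real_safe]) == real_safe


# A sibling directory whose name merely begins with "/safedir" fools the prefix check.
print(is_inside_naive("/safedir_evil/secret.txt"))       # True (wrongly accepted)
print(is_inside_commonpath("/safedir_evil/secret.txt"))  # False (rejected)
```

The second function compares whole path components rather than raw characters, so a directory such as /safedir_evil is no longer mistaken for /safedir. Resolving symlinks first is a further precaution and is an assumption of this sketch.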
The graph below compares how comprehensive code reviews are when written by humans (green), by CriticGPT (orange), and by humans working with CriticGPT (pink). CriticGPT's reviews are more comprehensive than the human reviews.
Below is a graph comparing the percentage of code reviews that contain false information (hallucinations). The percentage of hallucinations is lower when humans work with CriticGPT than when CriticGPT reviews the code alone.
OpenAI said, "We need better tools to tune increasingly complex AI systems," and intends to continue developing tools like CriticGPT for evaluating and correcting AI output.