
Apple AI research shows reasoning models collapse when problems are more complex

A research paper from Apple published on June 6 stated that although large reasoning models (LRMs) showed improved performance on benchmarks, they struggled with accuracy as problems became more complex. Titled “The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity,” the paper revealed that even the most advanced AI reasoning models collapsed entirely when facing harder problems.

“They exhibit a counter-intuitive scaling limit: their reasoning effort increases with problem complexity up to a point, then declines despite having an adequate token budget,” the paper noted.

To test the AI models, the researchers categorised the problems into low-, medium- and high-complexity tasks, using puzzles such as Checkers Jumping, River Crossing, Blocks World and the Tower of Hanoi.
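Puzzles like the Tower of Hanoi make complexity easy to dial up, because the minimum number of moves grows exponentially with the number of disks (2^n − 1 for n disks). A minimal Python sketch, not taken from the paper, shows this growth:

```python
def hanoi(n, src="A", aux="B", dst="C", moves=None):
    """Return the list of moves solving Tower of Hanoi with n disks."""
    if moves is None:
        moves = []
    if n == 1:
        moves.append((src, dst))
        return moves
    hanoi(n - 1, src, dst, aux, moves)  # move n-1 disks out of the way
    moves.append((src, dst))            # move the largest disk
    hanoi(n - 1, aux, src, dst, moves)  # move n-1 disks back on top
    return moves

# The solution length doubles (plus one) with each extra disk:
for n in (3, 7, 10):
    print(n, len(hanoi(n)))  # prints 3 7, then 7 127, then 10 1023
```

Adding a single disk roughly doubles the required solution length, which is why a model that handles a 3-disk puzzle can still fail badly at 10 disks.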

The researchers picked Claude 3.7 Sonnet and DeepSeek-V3 from among the large language models, and Claude 3.7 Sonnet with Thinking and DeepSeek-R1 from among the large reasoning models. The research concluded that both types of AI models had a similar level of capability.

The models solved the low-complexity puzzles, but as the tasks moved into the high-complexity category, both kinds of models failed to produce correct solutions.

The hardware giant has been seen as lagging behind its rivals in AI development. Notably, Apple’s annual Worldwide Developers Conference is also expected to begin later today.

Published - June 09, 2025 02:58 pm IST