Education has come a long way. Classrooms are more connected, content is more interactive, and students are learning in ways that weren’t even possible a few decades ago. But behind the innovation in digital learning, some things remain the same, especially the pressure on educators.
From managing large classes to ensuring fair, consistent assessment, today’s instructors face an ever-growing list of challenges. The stakes are high, the work is complex, and too often, the tools simply aren’t designed with real educational settings in mind.
That’s where one professor at the University of Toronto found himself - managing a high volume of oral exams with no scalable, standardized way to grade them. So he partnered with our team at Roca Mindhub to build a custom AI-powered platform that supports educators.
This case study explores how thoughtful custom software development reshaped the assessment process - and what it could mean for the future of scalable, fair, and supportive evaluation in higher education.
Oral exams have clear pedagogical benefits: they test for critical thinking, real-time reasoning, and conceptual understanding. But they also introduce a handful of issues that make them difficult to scale or standardize.
Our collaboration began with a deceptively simple question: How can we create fairer, faster, and more insightful assessments, without losing the human touch?
In tackling this, we focused on three systemic pain points in oral exams: the emotional pressure students feel in the moment, the way vocal delivery can distort how performance is judged, and the difficulty of assessing content depth consistently at scale.
From day one, our goal was to support educators, not replace them. We weren’t building a generic AI tool or off-the-shelf educational software. This was a custom-built platform, designed to align with the course curriculum, the needs of teaching assistants, and the specific grading practices in place - all while accounting for the human factors that shape real classroom dynamics.
The result was an AI-driven assessment system for oral exams that could analyze student responses in real time. It didn’t just evaluate what students said, but how they said it - factoring in tone, pacing, and expression. By combining computer vision, natural language processing (NLP), and machine learning, the platform helped create a more consistent and supportive student evaluation process.
The platform used AI models to detect general emotional cues during student responses. Importantly, it didn’t try to decode specific micro-expressions; instead, it flagged broader patterns that might warrant a second look, such as visible stress, confusion, or low engagement.
These signals didn’t penalize students; they helped the system offer support. If the platform picked up on signs of discomfort, it would serve a follow-up question to give the student a second chance to demonstrate knowledge more clearly.
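To make the flow concrete, here is a minimal sketch of how cue scores could gate a supportive follow-up. The signal names, score ranges, and threshold are all illustrative assumptions, not the platform's actual model outputs:

```python
from dataclasses import dataclass

@dataclass
class EmotionalCues:
    """Hypothetical per-response scores (0.0-1.0) from a vision model."""
    stress: float
    confusion: float
    engagement: float

def needs_follow_up(cues: EmotionalCues, threshold: float = 0.7) -> bool:
    """Decide whether to serve a follow-up question.

    Cues never lower a grade; they only give the student a second
    chance to demonstrate understanding.
    """
    return (cues.stress > threshold
            or cues.confusion > threshold
            or cues.engagement < 1.0 - threshold)

# A calm, engaged response is not flagged...
print(needs_follow_up(EmotionalCues(stress=0.2, confusion=0.1, engagement=0.9)))  # False
# ...while visible stress triggers a supportive follow-up.
print(needs_follow_up(EmotionalCues(stress=0.85, confusion=0.1, engagement=0.9)))  # True
```

The key design choice mirrors the text: the output is a yes/no prompt for a follow-up question, never a score that feeds into the grade.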
We also analyzed vocal delivery using AI-driven speech tools. Why? Because delivery can distort how performance is assessed, especially in high-pressure situations. The system looked for patterns in tone and pacing.
Instead of letting these elements impact grades, the platform used them as signals. When something seemed off, it triggered follow-up questions from the curriculum, giving students a chance to explain further or clarify their thinking.
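A sketch of that "signals, not penalties" idea applied to pacing: raw delivery measurements become neutral flags, and any flag prompts a follow-up rather than a deduction. The thresholds and flag names here are invented for illustration:

```python
def delivery_flags(word_count: int, duration_seconds: float,
                   pause_seconds: float) -> list[str]:
    """Turn raw delivery measurements into neutral signals.

    Flags never affect the grade; they only suggest serving a
    curriculum follow-up so the student can clarify their thinking.
    """
    flags = []
    wpm = word_count / (duration_seconds / 60.0)  # speech rate
    if wpm < 90:
        flags.append("slow_pace")    # possible hesitation under pressure
    elif wpm > 200:
        flags.append("rushed")       # possibly rehearsed or anxious delivery
    if pause_seconds / duration_seconds > 0.4:
        flags.append("long_pauses")
    return flags

# 120 words in 60 seconds with brief pauses: nothing flagged.
print(delivery_flags(120, 60.0, 5.0))   # []
# 300 words in 60 seconds reads as rushed.
print(delivery_flags(300, 60.0, 5.0))   # ['rushed']
```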
The core of the assessment still came down to one question: “Did the student understand the material?” The platform used NLP to analyze content depth and identify whether students were truly engaging with key concepts.
Just as with the other dimensions, if a student’s answer seemed shallow or rehearsed, the system tested for deeper understanding - pulling from a curated, course-specific question bank.
With the help of the AI system, TAs could focus on outlier responses - such as highly polished answers that might signal over-preparation, or unclear ones showing confusion or disengagement. This allowed them to spot patterns, better understand student performance, and apply their judgment where it mattered most. Rather than replacing the human element, the platform supported it, helping educators assess more fairly, consistently, and meaningfully.
The system continues to evolve by learning from ongoing feedback and usage data, with machine learning algorithms steadily improving their ability to identify patterns that indicate strong or weak student understanding. Future updates may include multilingual support to accommodate diverse learners, real-time transcription to enhance accessibility, and integration with learning management systems (LMS) for seamless use across various educational environments.
Looking beyond oral exams, this AI-driven assessment approach has broad potential applications. Because it emphasizes evaluating performance, behavior, and conceptual mastery, the platform can be adapted for interviews, language proficiency testing, remote certification, or any scenario that requires scalable, nuanced assessment of human understanding.
This project wasn’t just about building AI software - it was about enhancing what educators do. It shows that technology doesn’t replace human judgment; it works alongside it. Together, they can take on real challenges like scale, fairness, and deeper insight into how students learn.
As education continues to evolve, the tools we build must evolve with it. With the right collaboration between people and technology, better learning outcomes aren’t just possible - they’re within reach.