This course explores techniques for measuring and improving the accuracy of Large Language Models (LLMs) when generating structured outputs.
Students will learn to evaluate and compare the performance of GPT and Gemini models across different data types, including numerical values, dates, and booleans. The curriculum covers practical implementation with hands-on exercises in data extraction, accuracy measurement, and result comparison.
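As a minimal sketch of the kind of accuracy measurement covered here, the snippet below compares model-extracted structured fields against ground truth with per-field exact matching. The field names, record values, and helper functions are illustrative assumptions, not part of any fixed course dataset or API.

```python
from datetime import date

# Hypothetical ground-truth record and model extraction; field names and
# values are illustrative only.
ground_truth = {"revenue": 1250.0, "report_date": date(2023, 5, 1), "is_audited": True}
model_output = {"revenue": 1250.0, "report_date": date(2023, 5, 2), "is_audited": True}

def field_accuracy(truth: dict, pred: dict) -> dict:
    """Per-field exact-match score (1.0 or 0.0 for a single record)."""
    return {k: float(pred.get(k) == v) for k, v in truth.items()}

def overall_accuracy(truth: dict, pred: dict) -> float:
    """Mean of the per-field scores across all ground-truth fields."""
    scores = field_accuracy(truth, pred)
    return sum(scores.values()) / len(scores)

print(field_accuracy(ground_truth, model_output))
# revenue and is_audited match exactly; report_date differs by one day
print(overall_accuracy(ground_truth, model_output))
```

In practice each data type (numbers, dates, booleans) often gets its own comparison rule, e.g. a numeric tolerance instead of exact equality, which is the kind of design choice the exercises explore.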
Special emphasis is placed on using explanation-based approaches to enhance model responses and streamline annotation processes. By the end of the course, participants will understand how to effectively assess LLM accuracy and implement strategies for obtaining more reliable structured outputs.
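One way to realize the explanation-based approach above is to prompt the model to justify its answer before committing to a structured value, then keep that explanation alongside the value for annotation review. The prompt wording, JSON schema, and simulated reply below are assumptions for illustration; no model API is actually called.

```python
import json

def build_prompt(question: str) -> str:
    """Ask the model to explain its reasoning before giving a structured answer.
    The JSON schema here is an illustrative convention, not a fixed standard."""
    return (
        f"{question}\n"
        "First think through a short explanation, then reply with JSON of the "
        'form {"explanation": "...", "value": ...}.'
    )

def parse_response(raw: str):
    """Split a model reply into the structured value and its explanation."""
    parsed = json.loads(raw)
    return parsed["value"], parsed["explanation"]

# Simulated model reply standing in for a real GPT or Gemini response.
raw = '{"explanation": "The invoice lists 3 line items.", "value": 3}'
value, explanation = parse_response(raw)
print(value, "-", explanation)
```

Retaining the explanation gives annotators a ready-made rationale to accept or reject, which is what streamlines the annotation process.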