Image Understanding

AI analyzes every image for text (OCR), objects, scenes, and other visual content, then generates searchable captions and descriptions so you can find images by what is in them, not just by filename.

How it works

Azure AI Vision processes each uploaded image through multiple analysis passes. OCR extracts any visible text, handwritten or printed. Object detection identifies what is in the image. Scene classification describes the overall context. A natural language caption then combines these signals into a searchable description. All of this becomes part of the knowledge item's searchable content.

[Image: Image analysis showing OCR text extraction and object detection]
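
As a rough illustration of how the signals described above might end up in one searchable description, here is a minimal sketch in Python. It does not call Azure AI Vision and is not the product's implementation. The `ImageAnalysis` container, the `build_searchable_text` and `matches` helpers, and the sample values are all hypothetical, chosen only to show the idea of merging OCR text, objects, scene, and caption into indexable content.

```python
from dataclasses import dataclass, field


@dataclass
class ImageAnalysis:
    """Hypothetical container for the output of the analysis passes."""
    ocr_lines: list[str] = field(default_factory=list)  # text found by OCR
    objects: list[str] = field(default_factory=list)    # detected object labels
    scene: str = ""                                     # scene classification
    caption: str = ""                                   # natural language caption


def build_searchable_text(analysis: ImageAnalysis) -> str:
    """Merge every signal into one lowercase string suitable for indexing."""
    parts = [analysis.caption, analysis.scene, *analysis.objects, *analysis.ocr_lines]
    return " ".join(p.strip().lower() for p in parts if p.strip())


def matches(query: str, searchable_text: str) -> bool:
    """Naive search (an assumption): every query word must appear in the text."""
    words = set(searchable_text.split())
    return all(term in words for term in query.lower().split())


# Made-up sample values standing in for a real analysis result.
whiteboard = ImageAnalysis(
    ocr_lines=["API Gateway", "Auth Service"],
    objects=["whiteboard", "marker"],
    scene="meeting room",
    caption="Architecture diagram drawn on a whiteboard",
)
text = build_searchable_text(whiteboard)
print(matches("architecture diagram", text))
print(matches("invoice", text))
```

A real search index would add ranking, stemming, and phrase matching. The point here is only that once the analysis output is flattened into text, an image can be found by a query about its content, even when its filename says nothing useful.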

Why it matters

Images often contain critical knowledge -- whiteboard photos, screenshots, diagrams, handwritten notes. Without analysis, they are black boxes that search cannot penetrate. Image understanding makes visual content as findable as text. Search for 'architecture diagram' and find the whiteboard photo from last month's meeting.

[Image: Search results returning images based on their analyzed content]

Related Features

Face Recognition (Intelligent Enrichment)

Detects and matches faces against known people.

Video Analysis (Intelligent Enrichment)

Key frame extraction, face detection, and text recognition in video.

Document Parsing

Get early access to Image Understanding

Create your free account and get access to Image Understanding today.
