OCR text or letter recognition – a definition of optical character recognition
Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic recognition and translation of images of handwritten, typewritten or printed text (usually captured by a scanner) into machine-editable text.
How does optical character recognition work?
Optical character recognition (using optical techniques such as mirrors and lenses) and digital character recognition (using scanners and computer algorithms) were originally considered separate fields. Because very few applications survive that use true optical techniques, the term OCR has been broadened to include digital image processing as well. This means that an image of a text document can be turned into an electronic document by scanning the page, recognizing the letters in the image, and putting the result into digital storage.
Optical character recognition software
Early OCR programs required training (the provision of known samples of each character) to read a specific font, but these days OCR software with a high degree of recognition accuracy for most fonts is common. Nuance’s OmniPage 17 OCR software, for example, is capable of reproducing formatted output that closely approximates the original scanned page, including images, columns and other non-textual components.
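To make the idea of training on known samples concrete, here is a minimal sketch in Python of template matching, one simple way a program could learn a font from one sample per character. It is an illustration under stated assumptions, not how OmniPage or any other product works: the 3x5 bitmaps, the SAMPLES table and the function names parse_glyph, train and recognize are all invented for this example.

```python
# Toy illustration only: tiny 3x5 bitmaps stand in for scanned glyphs.
# All names and data here are hypothetical, not taken from any real OCR product.

def parse_glyph(rows):
    """Turn rows of '#' (ink) and '.' (blank) into a flat tuple of 1s and 0s."""
    return tuple(1 if ch == "#" else 0 for row in rows for ch in row)


def train(samples):
    """The 'training' step: store one known bitmap per character."""
    return {label: parse_glyph(rows) for label, rows in samples.items()}


def recognize(templates, rows):
    """Return the label whose stored bitmap differs from the glyph in the fewest pixels."""
    glyph = parse_glyph(rows)

    def distance(label):
        return sum(a != b for a, b in zip(templates[label], glyph))

    return min(templates, key=distance)


# Known samples of each character in one (made-up) font.
SAMPLES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}

templates = train(SAMPLES)

# A scanned "T" with one pixel lost to noise.
noisy_t = ["###", ".#.", ".#.", "...", ".#."]
print(recognize(templates, noisy_t))  # prints "T"
```

Because the templates come from a single font, a character set in a different typeface may sit closer to the wrong template, which is why a program built this way would need fresh samples for each font it has to read.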