Open Source OCR using Tesseract and Google Colab
Optical Character Recognition (OCR) is a long-standing use case in Computer Vision. It is popular because of its wide range of applications, such as data entry for businesses and number plate recognition. In short, it applies to any task where text has to be extracted from an image.
Tesseract is one of the most widely used open-source OCR engines. The original software is a command-line tool that is available for Windows, Linux, and macOS. Because Python is so popular nowadays, Tesseract can also be used from Python through an open-source wrapper.
Python-tesseract is a wrapper for Google’s Tesseract-OCR Engine. It is also useful as a stand-alone invocation script to tesseract, as it can read all image types supported by the Pillow and Leptonica imaging libraries, including jpeg, png, gif, bmp, tiff, and others.
Additionally, if used as a script, Python-tesseract will print the recognized text instead of writing it to a file.
Computer vision tasks are often resource intensive, so if you are running low on processing power, a convenient option is Google Colab.
Google provides a free cloud service based on Jupyter Notebooks that includes free GPU access. Not only is this a great tool for improving coding skills, but it also allows anyone to develop deep learning applications using popular libraries such as PyTorch, TensorFlow, Keras, and OpenCV.
Here are the steps to extract text from an image in a Google Colab notebook using Pytesseract:
Step1: Install Pytesseract and tesseract-OCR in Google Colab.
!sudo apt install tesseract-ocr
!pip install pytesseract
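Before moving on, it can help to confirm that the install worked. The sketch below is not part of the original notebook. It assumes the apt package places a binary named tesseract on the PATH, and it uses only the standard library to look for it. The helper name find_tesseract is hypothetical.

```python
import shutil


def find_tesseract(binary_name="tesseract"):
    """Return the full path of the Tesseract binary, or None if it is not on the PATH."""
    # shutil.which searches the PATH in the same way the shell would.
    return shutil.which(binary_name)


path = find_tesseract()
if path is None:
    print("tesseract was not found; re-run the install cell")
else:
    print("tesseract found at", path)
```

If the binary is reported as missing, Pytesseract calls later in the notebook would be expected to fail, so it is worth fixing the install first.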
Step2: Import libraries
import pytesseract
import shutil
import os
import random
try:
    from PIL import Image
except ImportError:
    import Image
Step3: Upload Image to the Colab
We can upload the image manually through the file upload option in the Colab interface, but we can also use the following code to upload the image to Colab.
from google.colab import files
uploaded = files.upload()
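files.upload() returns a dictionary that maps each uploaded file name to its contents as bytes, so the file name does not have to be typed by hand. Here is a minimal sketch, assuming exactly that return shape. The helper first_uploaded and the sample dictionary are hypothetical and only illustrate the idea outside Colab.

```python
def first_uploaded(uploaded):
    """Return (file_name, file_bytes) for the first entry of an upload dictionary."""
    if not uploaded:
        raise ValueError("no file was uploaded")
    # Dictionaries keep insertion order, so this is the first uploaded file.
    name = next(iter(uploaded))
    return name, uploaded[name]


# Stand-in for the dictionary returned by files.upload() in Colab.
sample_upload = {"image.jpg": b"\xff\xd8\xff\xe0"}
name, data = first_uploaded(sample_upload)
print(name, len(data))  # prints: image.jpg 4
```

In Colab, the same helper could be called on the uploaded variable, and the returned name passed on as the image path.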
Step4: Text Extraction
The image_to_string function takes an image as an argument and returns the text extracted from it. We can either print the result directly or store the string in a variable.
# Path of the uploaded image inside the Colab working directory
image_path_in_colab = 'image.jpg'
extractedInformation = pytesseract.image_to_string(Image.open(image_path_in_colab))
print(extractedInformation)
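The returned string often contains blank lines and stray whitespace, and it may end with a form feed character. The sketch below tidies such a string before it is stored or printed. The helper clean_ocr_text is hypothetical, and the sample string is made up for illustration rather than taken from a real OCR run.

```python
def clean_ocr_text(raw_text):
    """Strip whitespace from each line and drop empty lines."""
    # str.splitlines breaks on newlines and also on form feeds ("\f").
    stripped = (line.strip() for line in raw_text.splitlines())
    return "\n".join(line for line in stripped if line)


sample = "  Hello World \n\n OCR test  \n\f"
print(clean_ocr_text(sample))
```

With the variable from the step above, this would be called as clean_ocr_text(extractedInformation).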
Say we want to use this sample text as our source to test our OCR:
After processing in Google Colab, we get the following result:
The GitHub repo for the project is available at: https://github.com/suyesha07/Optical-Character-Reader
Feel free to check out the Colab notebook at: https://colab.research.google.com/github/suyesha07/Optical-Character-Reader/blob/main/OCR.ipynb