Computer Vision: Intelligent Automation that Can See

If you’ve read our blogs, you know that we have discussed intelligent automation many times. Defined as the combination of Robotic Process Automation (RPA) and artificial intelligence (AI), intelligent automation can process unstructured and semi-structured data to automate end-to-end business processes, and it learns to improve its performance as more data is fed into it.

Yet, the field of AI covers a broad territory of technologies that include machine learning, deep learning, natural language processing, and neural networks. And included in any discussion of intelligent automation is one increasingly important AI technology: computer vision. Computer vision is an essential component in any intelligent RPA solution. It’s already being used in a broad range of use cases, including invoice processing, insurance claims, and know-your-customer (KYC) initiatives at financial institutions.

Computer vision: a primer

Computer vision is a technology that is still being investigated, so it might be more appropriate to call it a “field of development.” It helps computers “see” and understand the content in digital images such as photographs, PDFs, diagrams, drawings, videos, and more.

Why is this important? Because of all the images that exist in digital form. The internet is made up of text and images. Text is easy to understand and search through. Images are a different matter. To be able to comprehend what images contain, we’ve depended on descriptions provided by humans. Those descriptions are called metadata: that is, data about data. But metadata such as “invoice” or “driving license” can’t give us the full value of all the content contained in those images. To get that value, we need computers that can see images the way humans do and understand their content.

Although this is a relatively trivial challenge for humans, it has turned out to be surprisingly difficult for machines. A human can look at a document and say, “This is an invoice,” even if it is formatted differently from other invoices or has a different logo, colors, or fonts. Or a human can look at an image of an animal and say, “It’s a cat,” or “It’s a dog.” Computers need to be trained to do this. This is done using machine learning: feeding the system lots of labeled images and getting it to associate like ones with each other.
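To make the idea of “associating like images with each other” concrete, here is a minimal sketch in Python. It is not how production computer vision works (real systems typically use deep neural networks trained on very large image sets). Instead it uses a toy nearest-centroid classifier on made-up 2x2 grayscale “images.” The function names, labels, and pixel values are all assumptions chosen for illustration.

```python
from statistics import fmean


def centroid(vectors):
    """Average a list of equal-length pixel vectors, position by position."""
    return [fmean(column) for column in zip(*vectors)]


def train(labeled_examples):
    """Build a model: one average ("typical") image per label."""
    return {label: centroid(vectors) for label, vectors in labeled_examples.items()}


def classify(model, vector):
    """Return the label whose typical image is closest to the new image."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(model, key=lambda label: squared_distance(model[label], vector))


# Hypothetical training data: 2x2 grayscale images flattened to four
# pixel values, where 0.0 is black and 1.0 is white.
training_images = {
    "dark": [[0.1, 0.0, 0.2, 0.1], [0.0, 0.1, 0.1, 0.2]],
    "light": [[0.9, 1.0, 0.8, 0.9], [1.0, 0.9, 0.9, 0.8]],
}

model = train(training_images)
predicted_label = classify(model, [0.2, 0.1, 0.0, 0.3])
print(predicted_label)
```

The more labeled examples each category receives, the more representative its average becomes, which is the same intuition behind larger models improving as more data is fed into them.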

Computer vision is an increasingly popular technology to incorporate into automation and digital transformation initiatives. The global computer vision market size was estimated at $10.56 billion in 2019 and is expected to reach $11.44 billion in 2020, according to Grand View Research.

Computer vision versus OCR technology

Many people confuse still-emerging computer vision with the older and fairly mature optical character recognition (OCR) technology. OCR is a subset of computer vision that only performs text recognition. It extracts and digitizes printed, typed, and some handwritten text. It converts analog characters into digital ones.

In a way, OCR was the first limited foray into computer vision. Today, however, computer vision does much more than simply extract text. It uses machine and deep learning to look at an image from the same perspective as a human would—to understand all the content that it contains. For example, if given a photograph of a stop sign, OCR might be able to extract the letters S-T-O-P. But broader computer vision applications can analyze the shape of the sign, its color, and its position relative to other objects in a way that OCR cannot.

Intelligent RPA with computer vision use cases

Here are some ways that computer vision is already being used to power intelligent automation applications.

Invoice processing: Invoices come in all shapes and sizes and arrive through all sorts of channels, such as email, fax, PDF, and the U.S. Postal Service, usually as unstructured data in the form of images. They could even enter your organization handwritten and hand-delivered. Before computer vision, humans had to process the important information on each invoice: the name of the vendor, order number, PO number, remittance information, and so on. Intelligent automation that uses computer vision can recognize all the important content, extract it, and process it for payment using AI-based rules for end-to-end automation of what formerly was a tedious, manual task.
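As a rough illustration of the “extract it” step, the sketch below pulls a few invoice fields out of text that is assumed to have already been produced by an upstream OCR or computer vision stage. It is a simplified stand-in: the field names, label wording, and regular expressions are hypothetical, and a real intelligent automation product would rely on trained models rather than fixed patterns, precisely because invoice layouts vary so much.

```python
import re

# Hypothetical label patterns; real invoices use many different wordings.
FIELD_PATTERNS = {
    "vendor": r"^Vendor:\s*(.+?)\s*$",
    "order_number": r"^Order (?:No\.|Number):\s*(\S+)",
    "po_number": r"^PO (?:No\.|Number):\s*(\S+)",
    "total": r"^Total(?: Due)?:\s*\$?([\d,]+\.\d{2})",
}


def extract_fields(ocr_text):
    """Return a dict of field name -> extracted value (None if not found)."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, ocr_text, flags=re.MULTILINE | re.IGNORECASE)
        fields[name] = match.group(1) if match else None
    return fields


# Made-up sample text standing in for the output of an OCR step.
sample_text = """ACME Supplies
Vendor: ACME Supplies Ltd.
Order Number: 48213
PO Number: PO-7719
Total Due: $1,284.50
"""

invoice = extract_fields(sample_text)
print(invoice)
```

Fields that cannot be found come back as None, which gives a downstream workflow a simple signal for routing the invoice to a person for review.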

Know-your-customer (KYC) automated onboarding: Remote verification of customers for KYC compliance is increasingly important, especially with the pandemic and people’s inability or reluctance to meet in person. Banks and other financial institutions in select countries in Europe, Latin America, and Asia are now permitted to use remote video-based customer identification processes for KYC, enabling these institutions to remotely onboard customers instead of requiring them to physically visit offices. Intelligent automation with computer vision considerably streamlines the customer onboarding processes in geographies where remote KYC video is allowed.

Insurance claims automation: Collecting on home or car insurance after damage has been incurred used to be a slow, manual affair that involved field inspections by insurance company agents, trips to mechanics or construction sites for estimates, and long approval processes. Today, with intelligent automation and computer vision, a homeowner or car driver can take a photo of the damage, send it in through an automated portal, and have the repairs estimated, the claim approved, and the check sent without human intervention.

Advancements continue

These three scenarios reflect just a fraction of what is—and what will be—possible with intelligent automation using computer vision. Among other use cases that are already emerging:

Equipment safety inspections
Automated retail checkouts
Medical diagnostics
Surveillance
Fingerprint recognition and biometrics

We’re keeping a close eye on this space to incorporate innovations as they become available. Stay tuned!

About Avi Bhagtani

Avi Bhagtani is senior director of product marketing, focused on artificial intelligence and cognitive automation. He has multiple years of industry experience managing global software product portfolios in software, the Internet of Things, AI, and cloud organizations.
