Document recognition is one of the most important parts of Basecone: it finds the right fields on a document so that they can be used when booking an invoice. Currently, this is done by static logic built into the product. What if we could make this logic smarter by training models that take over this recognition?
By applying artificial intelligence (AI), and Machine Learning in particular, we can teach a machine to recognize certain fields from existing examples. Over time, corrections made by users will also improve how well documents are recognized. How does such a thing work?
To be able to apply Machine Learning, you always need sample data. In this case, the sample data consists of documents on which the recognized fields have been labeled, so we know for sure what the fields on each document mean. The labeling is done by an automated annotation system, which uses the fields posted by the user on the booking screen to generate the annotations. To train the production model, we have labeled more than 1 million documents. This sample data, in combination with the right statistical model, allows us to predict the fields on new incoming documents.
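The automated annotation step can be illustrated with a minimal sketch. It assumes that each document has already been split into text tokens with positions, and it only handles amount fields. The names `Token`, `Annotation`, `normalize_amount`, `annotate_document` and the field key `amount_incl_vat` are hypothetical and chosen for illustration; they are not the actual Basecone implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Token:
    """A piece of text found on the document, with its position (assumed input)."""
    text: str
    x: float
    y: float
    width: float
    height: float


@dataclass(frozen=True)
class Annotation:
    """A label that links a booked field to a token on the document."""
    field: str
    token: Token


def normalize_amount(text: str) -> Optional[str]:
    """Normalize strings such as '1.234,56' or '1,234.56' to '1234.56'.

    Simplifying assumption: a separator in third-to-last position marks
    two decimals; every other separator is treated as a thousands separator.
    """
    cleaned = "".join(ch for ch in text if ch.isdigit() or ch in ",.")
    digits = "".join(ch for ch in cleaned if ch.isdigit())
    if not digits:
        return None
    if len(cleaned) >= 3 and cleaned[-3] in ",.":
        return f"{int(digits[:-2] or '0')}.{digits[-2:]}"
    return f"{int(digits)}.00"


def annotate_document(tokens: List[Token], booked_fields: Dict[str, str]) -> List[Annotation]:
    """Label every token whose normalized value equals a value booked by the user."""
    annotations = []
    for field, booked_value in booked_fields.items():
        target = normalize_amount(booked_value)
        if target is None:
            continue
        for token in tokens:
            if normalize_amount(token.text) == target:
                annotations.append(Annotation(field, token))
    return annotations
```

Because the labels come from what the user actually booked, no manual tagging is needed, which is what makes labeling documents at this scale feasible.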
The advantage of this statistical model is that it looks not only at the values of the fields themselves, but also at the layout of the document and the positions of the fields on it.
Machine Learning becomes more successful with more sample data and with more varied samples. That is why we trained the model with more than 1 million invoices, picked from different offices so that many different types of invoices are included.
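As a rough illustration of what "value plus position" can mean as model input, the sketch below turns one token into a small set of features. The feature names and the function `token_features` are assumptions made for this example; the features used by the actual model are not described here.

```python
from typing import Dict, Union


def token_features(
    text: str, x: float, y: float, page_width: float, page_height: float
) -> Dict[str, Union[float, bool]]:
    """Combine value-based and position-based features for one token."""
    digit_count = sum(ch.isdigit() for ch in text)
    return {
        # Value-based: what the text itself looks like.
        "digit_ratio": digit_count / len(text) if text else 0.0,
        "has_decimal_separator": any(ch in ",." for ch in text),
        # Layout-based: where the token sits, relative to the page size,
        # so that documents of different sizes are comparable.
        "rel_x": x / page_width,
        "rel_y": y / page_height,
    }
```

With features like these, a token that looks like an amount and sits near the bottom right of the page can be scored differently from the same value appearing in a line item higher up.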
Basecone customers correct the recognition in the booking screen when the recognition of a document is not successful. After this correction, we know which fields should have been recognized. We use this data as sample data to train our machine learning model, so the corrections made by our customers make the model smarter.
We have trained our machine learning model with more than 1 million invoices from different offices. When we tested this model on a separate set of invoices, we saw an improvement in recognition compared to the legacy application. To make sure the model performs well, we wanted to test more invoices in a live environment, so we ran shadow mode testing for around 50 offices. In shadow mode, the outcome of the model for some of the processed documents is stored in the background. The user will not see any of this, as the current static recognition service is still being used at the front end. In this way we can easily compare the current recognition service with the new service (the AI model).
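Shadow mode testing can be sketched as follows. This is a simplified illustration, not the production setup: `process_document`, the two recognizer callables, and the in-memory `shadow_log` list are hypothetical stand-ins for the real services and storage.

```python
from typing import Callable, Dict, List

Fields = Dict[str, str]


def process_document(
    document_id: str,
    legacy_recognize: Callable[[str], Fields],
    model_recognize: Callable[[str], Fields],
    shadow_log: List[dict],
) -> Fields:
    """Serve the legacy result to the user and record the model result in the background."""
    legacy_result = legacy_recognize(document_id)
    entry = {"document_id": document_id, "legacy": legacy_result, "model": None}
    try:
        entry["model"] = model_recognize(document_id)
    except Exception as error:
        # A failure in the shadow path must never affect what the user sees.
        entry["error"] = repr(error)
    shadow_log.append(entry)
    # Only the legacy result reaches the front end.
    return legacy_result
```

The stored entries can later be compared with the values the user actually booked, which gives a side-by-side view of both services on the same live documents.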
The trained AI model has been tested on invoices from Belgium (BE), the Netherlands (NL) and the UK (GB). The results are promising. For the total amount field, when we consider predictions with a confidence score (how sure the model is about its prediction) above 10%, applied per language, the machine learning model shows an average improvement of 3% in recognition accuracy.
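A comparison of this kind could be computed roughly as in the sketch below. The record layout (`language`, `truth`, `legacy`, `model`, `confidence`) and the function `compare_by_language` are assumptions for illustration. The sketch also assumes that both services are scored on the same subset of documents, namely those where the model's confidence exceeds the threshold; the exact evaluation procedure is not specified in this article.

```python
from collections import defaultdict
from typing import Dict, List


def compare_by_language(records: List[dict], threshold: float = 0.10) -> Dict[str, dict]:
    """Per language, compare legacy and model accuracy on confident predictions."""
    counts = defaultdict(lambda: {"n": 0, "legacy": 0, "model": 0})
    for record in records:
        # Skip predictions the model itself is not confident about.
        if record["confidence"] <= threshold:
            continue
        bucket = counts[record["language"]]
        bucket["n"] += 1
        bucket["legacy"] += record["legacy"] == record["truth"]
        bucket["model"] += record["model"] == record["truth"]
    return {
        language: {
            "legacy_accuracy": bucket["legacy"] / bucket["n"],
            "model_accuracy": bucket["model"] / bucket["n"],
            "improvement": (bucket["model"] - bucket["legacy"]) / bucket["n"],
        }
        for language, bucket in counts.items()
    }
```

Here `truth` would be the amount the user finally booked, and `improvement` is the difference between the two accuracies for that language.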
Our goal is to get as close to 100% as possible, but 100% accuracy is not realistic in this case. We are always dependent on a number of factors, such as the quality of a document.
We have chosen to initially focus on one field of an invoice, the “Amount (incl. VAT)” field, and the above results relate to this field. We chose it because our analysis of recognition problems showed that most of them concern this field. After that, we can add the other fields one by one.
This initiative will reduce the need for customers to make corrections by improving document recognition accuracy.