Can Machine Learning Determine the Quality of a Milestone?

Paul Groenewoud
Jan 20, 2023
3 min read

Are you looking for a way to ensure consistent assessment of data quality in financial project audits? Machine learning could be the answer! This post will provide an overview of how machine learning can be used to assess the text quality of milestones in a financial project audit system. We'll also discuss the key challenges and lessons learned, potential benefits, and implications of using machine learning for this purpose. Finally, we'll explore data collection and options for data quality awareness in production. Get ready to take your financial project audits to the next level!

Milestones

Milestones are important to the success of any project, regardless of size or complexity. They help track and communicate progress, measure success, and provide a reference point for the team and management

The Client

Our client’s interest was sparked to use machine learning as a way to neutrally and consistent, evaluate the text formulation of thousands of milestones, created in a process driven, financial project auditing system. The rating of the milestone formulations would enable a KPI, which would enhance the data quality awareness and mitigate auditing risks.

The Request

We were approached for a feasibility study to determine if Machine Learning can determine the quality of a milestone formulations? The business requirement was to provide statistics for a milestone quality KPI. A solution would have to fit into the client's technology stack and be maintained by their existing support structure.

In this blog post, we will explore the potential of using machine learning to assess milestone quality and discuss some of the results and implications that have been observed. We hope this discussion will help shed light on the potential benefits and associated challenges.

The Study and Concept

While also considering IBM’s Watson and Google’s TensorFlow, we decided to use Microsoft's ML.NET. ML.NET is in-line with the client's existing technology stack and is compatible with Google's very expressive TensorFlow models.

Our analysis of the data revealed, as the client indicated, that the process often naturally dictated the formulation of milestones. This reduced the expressiveness of the milestones, required more effort to assess and thus poses as a risk.

To develop the Machine Learning Model to assess the quality of milestones, the quality range used to rate the milestones were reduced from five to three points. This allowed the reduction of data required by 40% and narrow the focus. In short, the model converts the milestone’s text to a number, 1, 2, or 3. Sample data and validation data is required in equal amounts for the three different categories.

A typical first hurdle to overcome was to enable the client to provide sufficient training and validation data, fast and efficiently. In a few iterations, a data collection app (POC) was developed, allowing engineers to rate or adjust existing milestones, with minimal effort. Excel is great, a custom UI increase the data collection efficiency immensely and provided additional data capturing capabilities.

Below the concept of three variations of milestone ratings. The user would select an own rating – the Current ML Rating shown. Additionally, the user can change the text as in the third example.

The machine learning model was created intuitively, and after a few iterations and more data, it was clear that the goal was well within reach. One can use machine learning to analyze the properties and assess the quality of a milestone.

Initially the focus was the feasibility and to raise awareness of data quality using KPIs. However, in the process of developing the data collection tool, the focus shifted to the added value of an accurate assessment of the milestone text, at the time of data input.

In addition to seamlessly translating the business requirements and the delivering a positive result for the feasibility study, we were able to support the proposal of adding direct data quality feedback, at point of capture, to the steering committee of the data generating auditing system. This kind of data quality awareness would be much more efficient and to a wider audience.

Learnings

Machine Learning can be used to assess a milestone's text formulation. This helps to mitigate potential auditing risks and may have a positive impact the overall achievements of an organization.

It is important to note that the accuracy of milestone quality estimations, in addition to a carefully designed machine learning model, heavily depends on the quality and relevance of the training and validation data.
Even in small projects, an agile process is a basis for good collaboration with the client.
Although this was only a one-person project - the success was that of the collective, the client and colleagues all working together, even though they are not “on a project”.
Trust, transparency and the permission to ask questions outside of the context, provided an environment where great results can be achieved.

If you have questions that ChatGPT, can't answer right away, please get in touch with us. The latter will be evaluated out of curiosity in this subject, so be tuned to follow the results here.