Over the years, as Artificial Intelligence and allied fields have grown at breakneck speed, some very wild claims have been made. Perhaps one of the wildest is that we will soon have machines that can code any complicated algorithm when given a problem statement. AI may indeed displace a large section of the workforce in the coming years. However, we are still several years away from it affecting the lives of software engineers and other Computer Science professionals.
Despite the sophistication and successes of Natural Language Processing, it still has many limitations. One such limitation is the ability of NLP models to solve algorithmic problems, more specifically competitive programming problems. In an attempt to build a model that could solve ICPC problems, NEAR.AI worked on creating a dataset of such problems. In a blog post, they made a public appeal to all programmers on Codeforces to participate in the effort.
The deal was that programmers would interpret the problem logically in the form of statements and could optionally submit code as well. The Codeforces community is one of the largest competitive programming communities in the world, and includes some of the best programmers, such as tourist and Petr, so it made sense to ask its members to contribute. NEAR.AI hoped that, in the process, they would be able to train a machine to build the logic for any given problem. The company paid a significant amount to people willing to help build the dataset. But eventually, despite generating a moderately large dataset, they did not achieve much success. They published a paper reporting the results and addressing the shortcomings of state-of-the-art Natural Language Processing algorithms on these problems.
But they are not alone. Multiple research groups around the world are working on the program synthesis problem. In this article, we will look at how NEAR.AI attempts to solve the issues that these research groups have not yet succeeded in solving.
Who works on Program Synthesis?
Some of the major research groups that work on program synthesis include:
All these groups have made some progress on this challenging problem, but none is anywhere close to a commercially viable solution. As of today, we do not have a system that can code anything beyond simple programs, and such a system is of little use in industry.
The biggest challenge is to build a system that can solve any given problem statement (assuming it is in English). Typically, most competitive programmers solve a large chunk of problems by mentally mapping the given problem to a similar problem they have solved before and then reducing it to that problem.
Some of the significant challenges include:
- Lack of sufficient usable data: There is a shortage of usable data for training such a deep learning model. The challenges include differences in the programming styles of various programmers, differences in the wording of problem statements, and so on. In fact, the lack of data is one of the biggest causes of the lack of research interest in this field: most researchers prefer to work on existing datasets rather than figure out where to get data from. This is where competitive programming can help. Platforms like Topcoder, Codeforces, and CodeChef organize contests regularly, in which many users submit different correct solutions to the same problem. If this data were harnessed into a dataset, it could make program synthesis a popular field of research.
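As a rough illustration of how contest archives could become training data, accepted submissions can be paired with their problem statements. The sketch below is purely hypothetical: the records, field names, and verdict strings are made up, and real platforms expose their data differently.

```python
# Hypothetical sketch: pair problem statements with accepted solutions
# to form (statement, code) training examples. All records are invented.
submissions = [
    {"problem": "A", "verdict": "OK", "source": "print(sum(map(int, input().split())))"},
    {"problem": "A", "verdict": "WRONG_ANSWER", "source": "print(0)"},
    {"problem": "B", "verdict": "OK", "source": "print(max(map(int, input().split())))"},
]
statements = {"A": "output the sum of the integers", "B": "output the maximum"}

# Keep only accepted solutions; discard wrong submissions.
dataset = [
    (statements[s["problem"]], s["source"])
    for s in submissions
    if s["verdict"] == "OK"
]
print(len(dataset))  # 2
```

A real pipeline would also have to deduplicate near-identical solutions and normalize wildly different coding styles, which is part of why the data problem is hard.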
- The inherent uncertainty of deep learning models: Deep learning models work on the principles of probability and statistics. Every possible output has a prediction value associated with it, and the model chooses the most likely one. Hence, there is a significant chance that at some step of the program synthesis process a wrong decision is made, rendering the entire generated code worthless.
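A quick back-of-the-envelope calculation shows why this matters: even a model that picks the right token 99% of the time is unlikely to emit a 100-token program with every token correct (assuming, for simplicity, independent per-token errors):

```python
# Illustrative arithmetic, not tied to any specific model: the chance
# that every token of a program is correct, assuming independent errors.
def prob_all_tokens_correct(per_token_accuracy: float, num_tokens: int) -> float:
    return per_token_accuracy ** num_tokens

p = prob_all_tokens_correct(0.99, 100)
print(f"{p:.3f}")  # 0.366 — roughly a one-in-three chance of a flawless program
```

Since a single wrong token usually breaks the whole program, per-token accuracy has to be extraordinarily high before end-to-end generation becomes reliable.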
NEAR.AI is a startup working on this problem. They are trying to build a system that can potentially circumvent all of these issues. A brief overview of what they intend to do follows.
First, they intend to get rid of the fluff in the problem statement. Such sentences usually provide context but are unnecessary for solving the problem. They intend to rewrite problem statements so that only a very formal and dry description of what exactly needs to be done is left. This is the input to the pipeline.
Data Augmentation Step
An algorithm takes a solution as input and produces a very low-level statement (very close to the solution) as output. Training on such a generated dataset achieves non-zero accuracy on the human-generated dataset. The closer the generated statements get to what people actually write, in terms of measurable metrics, the better the trained model's accuracy on the human-generated set.
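The direction of this transformation (solution in, low-level statement out) can be sketched with a toy rule-based version. The real augmentation algorithm is not public, so the rules and the phrasing below are purely illustrative.

```python
import ast

# Toy sketch: derive a dry, low-level "statement" directly from a
# solution's syntax tree. The rewrite rules and wording are invented;
# the real augmentation algorithm is far more sophisticated.
class StepCollector(ast.NodeVisitor):
    def __init__(self):
        self.steps = []

    def visit_For(self, node):
        self.steps.append("iterate over the input")
        self.generic_visit(node)

    def visit_AugAssign(self, node):
        if isinstance(node.op, ast.Add):  # `x += ...` pattern
            self.steps.append("accumulate a running sum")
        self.generic_visit(node)

    def visit_Return(self, node):
        self.steps.append("output the result")

def describe(solution_src: str) -> str:
    collector = StepCollector()
    collector.visit(ast.parse(solution_src))
    return "; ".join(collector.steps)

src = "def solve(a):\n    s = 0\n    for x in a:\n        s += x\n    return s\n"
print(describe(src))  # iterate over the input; accumulate a running sum; output the result
```

A statement produced this way is "very close to the solution" in exactly the sense described above: it is almost a transliteration of the code, far drier than anything a human problem setter would write.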
“Non-zero accuracy” might not sound very impressive. But consider what it means: the model learns to solve some elementary human-generated problems while being trained only on synthetic data. That is very promising, and a huge step towards solving more complex problems and ultimately solving actual contest problems in a live contest.
Fixing the uncertainty of Deep Learning models: AST Trees
At the core of NEAR.AI's approaches are deep learning models that read in text and produce code, either as a sequence of tokens or as an AST. Either way, as mentioned above, even a very well-trained model has a high chance of getting at least one token wrong.
Addressing this problem is itself very close to what competitive programmers do, and several approaches have been explored. The idea NEAR.AI is using is to perform a beam search over ASTs until it finds the AST of a program that passes the sample tests. However, due to the lack of good-quality data, they were unable to validate their studies satisfactorily.
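The idea can be caricatured in a few lines. Here the search is collapsed to a single decision (which arithmetic operator to emit) and the model scores are invented, but the shape is the same: rank candidates by model score, keep the top few, and accept the first one that passes the sample tests.

```python
# Caricature of beam search over candidate programs. The operator scores
# stand in for a neural model's probabilities; a real system searches over
# whole token sequences or AST expansions, not a single operator choice.
OPS = {"+": 0.5, "-": 0.3, "*": 0.2}  # hypothetical model scores

def passes_samples(op, samples):
    prog = eval(f"lambda a, b: a {op} b")
    return all(prog(a, b) == expected for (a, b), expected in samples)

def beam_search(samples, beam_width=2):
    # Keep the beam_width highest-scoring candidates, best first.
    beam = sorted(OPS, key=OPS.get, reverse=True)[:beam_width]
    for op in beam:                      # try candidates in score order
        if passes_samples(op, samples):  # sample tests act as the filter
            return f"a {op} b"
    return None  # no candidate in the beam passed the samples

samples = [((5, 3), 2), ((10, 4), 6)]
print(beam_search(samples))  # the top-scored "a + b" fails, "a - b" passes
```

The key point the sketch preserves: the model's single most likely output is often wrong, but keeping several hypotheses and filtering them against the sample tests recovers a correct program anyway.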
An Alternative Approach
Another approach they are working on is to let another (or the same) deep learning model fix the program if the first generated program doesn't pass the samples. The high-level idea is to train a model that takes as input a problem statement and the program from the previous iteration, along with some feedback on why the program is wrong, and generates a new program.
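The loop itself is simple; the hard part is the model inside it. In the sketch below both the generator and the "repair" step are hard-coded stand-ins, so the function names and behavior are assumptions, not NEAR.AI's actual system.

```python
# Hedged sketch of the generate-then-repair loop. `mock_generate` stands in
# for a neural model conditioned on the statement plus failure feedback.
def mock_generate(statement, feedback=None):
    if feedback is None:
        return "lambda a, b: a - b"  # first (wrong) attempt
    return "lambda a, b: a + b"      # "repaired" attempt after feedback

def run_samples(src, samples):
    """Return the sample cases the program gets wrong."""
    prog = eval(src)
    return [(inp, want, prog(*inp)) for inp, want in samples if prog(*inp) != want]

def synthesize(statement, samples, max_iters=3):
    feedback = None
    for _ in range(max_iters):
        src = mock_generate(statement, feedback)
        failures = run_samples(src, samples)
        if not failures:
            return src  # passes all samples: done
        # Feed the first failure back as feedback for the next attempt.
        feedback = f"failed on {failures[0]}"
    return None

samples = [((2, 3), 5)]
print(synthesize("sum two integers", samples))  # lambda a, b: a + b
```

The open question, which the next section takes up, is what the feedback string should actually contain for a real model to make use of it.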
Issues with the model
Suppose that while generating a binary search, the model mistakenly swaps the lines left = middle and right = middle. If you then feed the code back to the model, spotting this bug would be extremely hard; even for humans it is a non-trivial task. So is source code really the best medium for feedback to the model?
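To make the example concrete, here is a standard binary search next to the version with the two boundary updates swapped. The functions are illustrative, not taken from any model's output; note how visually similar the two bodies are.

```python
def binary_search(a, target):
    """Correct: index of target in sorted list a, or -1 if absent."""
    left, right = -1, len(a)
    while right - left > 1:
        middle = (left + right) // 2
        if a[middle] < target:
            left = middle
        else:
            right = middle
    return right if right < len(a) and a[right] == target else -1

def binary_search_buggy(a, target):
    """Identical except `left = middle` and `right = middle` are swapped."""
    left, right = -1, len(a)
    while right - left > 1:
        middle = (left + right) // 2
        if a[middle] < target:
            right = middle   # swapped: should move `left`
        else:
            left = middle    # swapped: should move `right`
    return right if right < len(a) and a[right] == target else -1

print(binary_search([1, 3, 5, 7, 9], 5))        # 2
print(binary_search_buggy([1, 3, 5, 7, 9], 5))  # -1
```

A two-line diff is all that separates a correct program from a silently wrong one, which is exactly why handing the raw source back to a repair model gives it so little to work with.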
One option they have been exploring is feeding in an execution trace instead. If the model made the mistake above, the execution trace would show the search converging to one of the ends of the range instead of to a value in the middle, which provides more information than the code alone. The issue with this approach is that the execution traces of real programs are rather long, and condensing them to show only the information likely to be relevant is still an unsolved problem.
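A minimal version of that idea: log the search boundaries on each iteration of a binary search whose boundary updates have been swapped. The trace marches straight to one end of the range, which is exactly the signal the source code alone hides. (The logging scheme here is a toy, not a real tracing framework.)

```python
# Toy execution trace: record (left, right) each iteration of a binary
# search with the swapped-boundary bug. The trace drifts to one end of
# the range instead of closing in on the answer's position.
def buggy_search_with_trace(a, target):
    left, right, trace = -1, len(a), []
    while right - left > 1:
        middle = (left + right) // 2
        if a[middle] < target:
            right = middle   # bug: boundaries swapped
        else:
            left = middle
        trace.append((left, right))
    return trace

print(buggy_search_with_trace([1, 3, 5, 7, 9], 5))  # [(2, 5), (3, 5), (4, 5)]
```

Even this three-entry trace makes the bug obvious at a glance; the trouble, as noted above, is that traces of realistic programs are thousands of entries long.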
The different approaches that have been discussed and researched all run into roadblocks that we have not been able to get past. For machines, automatically reading a problem statement and converting it to code is not trivial: the language must first be interpreted correctly, and then a mapping to a known problem must be found. Often, due to differences in wording, a suitable mapping may not be found at all. While we have systems that can build simple programs, we are still very far from a stage where arbitrary problems can be solved. But a commercially viable product that could do so would open up endless possibilities. It would help corporations save a significant amount of money and labor time. Hence, this is a field that even corporate research organizations should consider for research. The upsides of such a technology would be unimaginable.