Using Modern Software Practices in Computer Science Curriculum Development
Software engineering has changed quite a bit in the past few years with the proliferation of the DevOps paradigm. This engineering practice uses continuous integration and deployment tools like Jenkins and Travis CI to get new code and features into production sooner rather than later. It does so by automatically building and testing software whenever changes are made, so developers find out early whether their code is likely to break something.
DevOps-driven development is often complemented (and accompanied) by containerization: the packaging of software inside a small virtual environment that can be transported to and executed on many systems, even if they are not configured the same way. The upsides of containerization are numerous, but the biggest one for most teams is the ability to write code on a developer machine and be confident that the program behaves exactly the same in production.
So that brings us to the topic of this post, which is how my team used containerization and the DevOps philosophy to write better computer science assignments.
In October 2019, I started a new position at Purdue University as a Curriculum Development Team Lead in the Computer Science (CS) department. In this position, I supervised a team of software developers and quality assurance testers who wrote assignments for a computer science summer program. Previously, I had been a teaching assistant (TA), where I also wrote many assignments for the introductory computer science course. Going into this new position, I wasn’t sure exactly how we should accomplish our goals, but I knew that we had to change things up a bit.
Many teams in industry have used new development strategies like DevOps and containerization for some time. However, when I wrote assignments as a TA, we stored all of our assignments in Google Drive, rather than in a source control manager, like GitHub. In fact, many developers did not even use source control on their personal machines. We also needed to manually configure assignments on our grading software and supply a script to run the grader.
After developing assignments as a TA, I went on to land an internship at a tech company in Indianapolis, where I learned the Agile process and DevOps-driven development. So when I was given the opportunity to manage my own team of developers, I took some of the things I learned from my internship in order to improve our workflow.
Because I was creating a brand-new project and team, I had to first get our basic workflow set up. This included setting up source control and project management tools, defining a template for assignments, and defining quality control standards.
Our team went back and forth on a couple of source control management (SCM) tools, but we settled on GitHub.com. GitHub offers a continuous deployment/integration tool, GitHub Actions, which became a vital tool later when we set up these processes. GitHub’s interface was also most familiar to our team, so it was a no-brainer for us.
We also tried quite a few tools for project management, but the best by far was Jira Software. Jira is free for teams of up to 10 people, and if you need more seats, they have special education pricing. Jira’s presets are fine for most teams, but we elected to mess around with our project workflows in Jira so that we could automate a few things.
It was important for our team’s success to have a template set up for creating new assignments. This would allow developers to quickly start working on an assignment without worrying about setting up boilerplate code each time. We achieved this by keeping the files that do not have to change between assignments (e.g., build scripts and editor configurations) in a template repository. After a few months, we decided to switch to Maven for building and testing, which also helped tremendously with templating. Our lab template repository is available on GitHub.
Another critical part of our workflow was a robust quality control system that ensured assignments were error-free before we released them. If assignments are released with errors, the instructional staff needs to take time to correct them and keep the students informed of changes. This takes away valuable time that could be devoted to teaching and assisting students with their assignments. Additionally, changes to assignments and their requirements can easily frustrate students (computer science students in particular, from my personal experience).
Every line of code or documentation that would be released was first reviewed by someone on the team who did not have a part in writing it originally. This way, a fresh set of eyes could check for code quality, style, and errors. We called this a “code review.” After code review, the assignment was passed on to a sub-team of quality assurance (QA) testers who would complete the assignment, making note of any errors that had slipped through, as well as checking that the assignment accomplished the learning outcomes and could be finished by the students in the time they would be given.
Containerization is the process of packaging a software application into a portable virtual environment that can be run predictably on any machine. This is achieved by bundling the program along with all of its dependencies. Containerized applications, commonly referred to as “containers,” can be deployed (and run) significantly faster than full-blown virtual machines: containers on the same host share that host’s kernel instead of each booting an operating system of their own, so less time is required to spin up new instances of an application. This is particularly useful for automatically graded submission systems, as a new “autograder” can be created quickly whenever a student submits their code.
Docker is the industry standard for creating containerized applications, and it has a large community of applications and services that support it. One service that is of particular interest to curriculum development is Gradescope, which is a grading service commonly used in higher education. Most faculty and instructors who have used Gradescope recognize it as a means to grade exams and written homework; however, it also has functionality to create automatically graded programming assignments. Gradescope has superb documentation on how to set up an assignment to use their autograding system. We chose Gradescope as our autograding solution, and it has proved to be an excellent tool to quickly grade students’ code and provide meaningful feedback.
One key feature of using Docker for programming assignments is the ability to build images off of a “base image” that can contain most of your dependencies, especially if you are using a template for assignments. For instance, we created our own autograder image that installs dependencies (like the Java JDK and our code linter) and adds a script that Gradescope can invoke to begin grading an assignment. This image can be used as the base image for our assignment images, which makes building assignments much quicker.
After we build our assignments, we publish them to a Docker repository that Gradescope has access to. Then, whenever a student submits an assignment, Gradescope will automatically pull the autograder for that assignment to begin grading the submission. We built a grading framework, Percolator (formerly Barista), that will run the tests and output a JSON file that contains the score for each test case. Gradescope reads this JSON file and saves the score, which is available to students after the autograder is finished.
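To make the JSON step concrete, here is a minimal sketch of writing such a file. It is written in Python for brevity and is not Percolator's code; the `CaseResult` class and the function names are hypothetical. The field names (`score`, and a `tests` list whose entries carry `name`, `score`, `max_score`, and `output`) follow the results format described in Gradescope's autograder documentation, which expects the file at `/autograder/results/results.json` inside the container.

```python
import json
from dataclasses import dataclass


@dataclass
class CaseResult:
    """Outcome of a single test case (hypothetical structure for illustration)."""
    name: str
    score: float
    max_score: float
    output: str = ""


def build_results(results):
    """Assemble a dictionary in the shape Gradescope reads from results.json."""
    return {
        # Total score is the sum of the per-test scores.
        "score": sum(r.score for r in results),
        "tests": [
            {
                "name": r.name,
                "score": r.score,
                "max_score": r.max_score,
                "output": r.output,
            }
            for r in results
        ],
    }


def write_results(results, path="results.json"):
    """Serialize the results; on Gradescope, path would point into /autograder/results/."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(build_results(results), handle, indent=2)
```

A grader would collect one `CaseResult` per test case while the tests run, then call `write_results` once at the end so Gradescope always sees a complete file.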
With our assignments now in containers, we decided that it would be nice if they were automatically updated when new changes are merged. This is known as continuous deployment, which is the practice of automating the deployment of new features, bug fixes, or other changes. In order to create a robust continuous deployment system, however, a continuous integration workflow should also be in place. Continuous integration is a system that automatically checks if new changes are compatible with the current version of the codebase. This is accomplished by not only checking for merge conflicts, but also building the code and running tests against it.
As mentioned previously, we chose GitHub Actions as our continuous integration and deployment tool. It is built into GitHub, and, with our GitHub Team subscription, we are provided more than enough monthly credits to build all of our assignments. When one of our developers opens a pull request, their code is built using Maven, and the test cases they wrote are run against their solution. This allows us to find discrepancies between the expected behavior and what the test cases are enforcing. Even if another developer approves the new changes in a code review, the code cannot be merged unless the build completes successfully and all test cases pass.
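The check described above can be sketched in a few lines. This is a Python stand-in for illustration only: our real pipeline runs Java tests through Maven, and the function names and the `(arguments, expected)` test-case shape here are assumptions made for the example.

```python
def find_discrepancies(solution, test_cases):
    """Run each test case against the reference solution.

    test_cases is a list of (arguments, expected) pairs. Returns a list of
    (arguments, expected, actual) tuples for every case where the solution
    and the test disagree.
    """
    failures = []
    for args, expected in test_cases:
        actual = solution(*args)
        if actual != expected:
            failures.append((args, expected, actual))
    return failures


def can_merge(review_approved, build_succeeded, failures):
    """A pull request merges only if it is approved, builds, and has no failing tests."""
    return review_approved and build_succeeded and not failures


def reference_add(a, b):
    """Stand-in for a developer's reference solution."""
    return a + b


# The second expectation is deliberately wrong, as a mistyped test might be.
cases = [((1, 2), 3), ((2, 2), 5)]
failures = find_discrepancies(reference_add, cases)
```

Here `failures` holds the one bad case, so `can_merge(True, True, failures)` is `False` even though a reviewer approved the change. A disagreement like this means either the solution or the test is wrong, and it is far better for a developer to find out which than a student.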
When changes are merged into the main branch, a new build kicks off, which automatically deploys the latest version. During the deployment process, a Java archive (JAR) file is created, which contains the test cases, the developer’s solution, and the dependencies for testing. If the build passes, a Docker image containing this JAR file is built and published to Docker Hub, where Gradescope can access it. All of our GitHub Actions workflows can be found in our lab template repository.
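The order of those steps matters, so here is a rough sketch of the sequence as a script. It is not our actual workflow, which lives in GitHub Actions configuration; the `deploy` function and the image name are hypothetical, and it assumes a Dockerfile in the current directory that copies the packaged JAR into the image. The commands themselves (`mvn package`, `docker build --tag`, `docker push`) are the standard Maven and Docker CLI invocations.

```python
import subprocess


def run_command(command):
    """Run a command and return its exit code (0 means success)."""
    return subprocess.run(command, check=False).returncode


def deploy(image, tag, run=run_command):
    """Package the assignment, then build and publish its Docker image.

    Stops at the first failing step, so a failed Maven build never
    produces or publishes an image. Returns the commands attempted.
    """
    reference = f"{image}:{tag}"
    steps = [
        ["mvn", "--batch-mode", "package"],             # build the JAR and run the tests
        ["docker", "build", "--tag", reference, "."],   # bake the JAR into an image
        ["docker", "push", reference],                  # publish where Gradescope can pull it
    ]
    attempted = []
    for command in steps:
        attempted.append(command)
        if run(command) != 0:
            break
    return attempted
```

Passing the runner in as a parameter makes a dry run easy: `deploy("example/lab01", "latest", run=lambda command: 0)` returns all three commands without executing anything.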
Tools like Gradescope and GitHub Actions make it fairly easy to use containerization and continuous deployment when creating programming assignments. Since adopting this new workflow, our team’s productivity has gone up considerably, as developers no longer need to manually deploy their assignments. The methods we used for creating Java programming assignments can be adapted to many other languages and frameworks, so if you are an instructor or curriculum developer, I encourage you to try this out.