uutils at GSoC

Google Summer of Code is:

Google Summer of Code is a global, online program focused on bringing new contributors into open source software development. GSoC Contributors work with an open source organization on a 12+ week programming project under the guidance of mentors.

If you want to know more about how it works, check out the links below.

Useful links:

What is it about?

The uutils project aims to rewrite key Linux utilities in Rust, targeting coreutils, findutils, diffutils, procps, util-linux, and bsdutils. Its goal is to create fully compatible, high-performance drop-in replacements, ensuring reliability through upstream test suites. Significant progress has been made with coreutils, diffutils, and findutils, while the other utilities are in the early stages of development.

How to get started

Here are some steps to follow if you want to apply for a GSoC project with uutils.

  1. Check the requirements. You have to meet Google's requirements to apply. Specifically for uutils, it's best if you at least know some Rust and have some familiarity with using the coreutils and the other tools.
  2. Reach out to us! We are happy to discuss potential projects and help you find a meaningful project for uutils. Tell us what interests you about the project and what experience you have and we can find a suitable project together. You can talk to the uutils maintainers on the Discord server. In particular, you can contact:
    • Sylvestre Ledru (@sylvestre on GitHub and Discord)
    • Terts Diepraam (@tertsdiepraam on GitHub and @terts on Discord)
  3. Get comfortable with uutils. To find a good project you need to understand the codebase. We recommend that you take a look at the code, the issue tracker and maybe try to tackle some good-first-issues. Also take a look at our contributor guidelines.
  4. Find a project and a mentor. We have a list of potential projects you can adapt or use as inspiration. Make sure to discuss your ideas with the maintainers! Some project ideas below have suggested mentors you could contact.
  5. Write the application. You can do this with your mentor. The application has to go through Google, so make sure to follow all the advice in Google's Contributor Guide. Please make sure you include your prior contributions to uutils in your application.

Tips

Project Ideas

This page contains project ideas for the Google Summer of Code for uutils. Feel free to suggest project ideas of your own.

Guidelines for the project list

Summarizing that page, each project should include:

Complete the ls GNU compatibility

Most of the features in ls have been implemented by now. However, work remains on color handling for full GNU compatibility, and some other tests are still failing. We have 12 remaining failing tests.

To get the list of failing tests, run:

$ ./util/remaining-gnu-error.py |grep "/ls/"

Complete the cp GNU compatibility

Most of the features in cp have been implemented by now. However, some corner cases still need to be implemented. We have 16 remaining failing tests.

To get the list of failing tests, run:

$ ./util/remaining-gnu-error.py |grep "/cp/"

Complete the mv GNU compatibility

Most of the features in mv have been implemented by now. However, some corner cases still need to be implemented. We have 10 remaining failing tests.

To get the list of failing tests, run:

$ ./util/remaining-gnu-error.py |grep "/mv/"

Improve stty

The stty utility is currently only partially implemented and should be expanded.

See issues: #3859, #3860, #3861, #3862, #3863.

Improve findutils coverage

More than half of the GNU findutils and BFS tests are passing. The goal of this project is to improve the compatibility of uutils/findutils with GNU's implementation.

See https://github.com/uutils/findutils

To achieve this, we should invest in fuzzing findutils, just as we are doing with some coreutils programs.

Localization

Support for localization for formatting, quoting & sorting in various utilities, like date, ls and sort. For this project, we need to figure out how to deal with locale data. The first option is to use the all-Rust icu4x library, which has a different format than what distributions usually provide. In this case a solution could be to write a custom localedef-like command. The second option is to use a wrapper around the C icu library, which comes with the downside of being a C dependency.

This is described in detail in issue #3997 and was also discussed in #1919 and #3584.

procps: Development of Process Management and Information Tools in Rust

This project focuses on creating Rust-based implementations of process management and information tools: ps, pgrep, pidwait, pkill, skill, and snice. The goal is to ensure full compatibility with all options and successful passing of GNU tests, maintaining the functionality and reliability of these essential tools.

procps: Development of System Monitoring and Statistics Tools in Rust

This project involves the Rust-based development of system monitoring and statistics tools: top, vmstat, tload, w, and watch. The objective is to achieve full compatibility with all options and to pass GNU tests, ensuring these tools provide accurate and reliable system insights.

procps: Development of Memory and Resource Analysis Tools in Rust

The aim of this project is to develop Rust-based versions of memory and resource analysis tools: pmap and slabtop. The project will focus on ensuring full compatibility with all options and passing GNU tests, providing in-depth and reliable analysis of memory usage and kernel resources.

util-linux: Reimplementation of essential system utilities in Rust

The objective of this project is to reimplement essential system utilities from the util-linux package in Rust. This initiative will include the development of Rust-based versions of various utilities, such as dmesg, lscpu, lsipc, lslocks, lsmem, and lsns. The primary focus will be on ensuring that these Rust implementations provide full compatibility with existing options and pass GNU tests, delivering reliable and efficient system utilities for Linux users.

util-linux: Process and Resource Management: Reimplementation in Rust

This project focuses on reimplementing crucial Linux utilities related to process and resource management in Rust. The target utilities include runuser, sulogin, chrt, ionice, kill, renice, prlimit, taskset, and uclampset. The primary goal is to create Rust-based versions of these utilities, ensuring compatibility with their original counterparts, and validating their functionality with GNU tests.

util-linux: User and Session Management: Reimplementation in Rust

This project focuses on reimplementing essential Linux utilities related to user and session management in Rust. The target utilities include su, agetty, ctrlaltdel, pivot_root, switch_root, last, lslogins, mesg, setsid, and setterm. The primary goal is to create Rust-based versions of these utilities, ensuring compatibility with their original counterparts, and validating their functionality with GNU tests.

This project aims to modernize and enhance critical Linux utilities related to user and session management, ensuring they remain efficient, reliable, and fully compatible with existing systems.

Code refactoring for procps, util-linux, and bsdutils

Refactoring the Rust-based versions of procps, util-linux, and bsdutils to reduce code duplication.

A multicall binary and core library for findutils

findutils currently consists of a few unconnected binaries. It would be nice to have a multicall binary (like coreutils) and a library of shared functions (like uucore).

This also might require thinking about sharing code between coreutils and findutils.
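To make the multicall idea concrete, here is a minimal sketch of how a single binary could pick a utility, either from the name it was invoked under (for example through a symlink) or from its first argument. All names in it (`UtilMain`, `find_main`, `xargs_main`, `lookup`, `dispatch`) are hypothetical and chosen for illustration; this is not how coreutils or findutils is actually structured.

```rust
use std::path::Path;

/// Signature shared by every utility entry point in this sketch (hypothetical).
type UtilMain = fn(&[String]) -> i32;

// Placeholder entry points. In a real design these would come from the
// individual utility crates and share helpers through a common library.
fn find_main(_args: &[String]) -> i32 {
    0
}

fn xargs_main(_args: &[String]) -> i32 {
    0
}

/// Map a utility name to its entry point.
fn lookup(name: &str) -> Option<UtilMain> {
    match name {
        "find" => Some(find_main as UtilMain),
        "xargs" => Some(xargs_main as UtilMain),
        _ => None,
    }
}

/// Decide which utility to run and which arguments to hand to it.
///
/// First try the basename of argv[0] (symlink-style invocation such as
/// `/usr/bin/find . -name x`), then fall back to argv[1]
/// (subcommand-style invocation such as `findutils xargs -0`).
fn dispatch(argv: &[String]) -> Option<(UtilMain, &[String])> {
    let invoked = argv.first()?;
    let base = Path::new(invoked).file_name()?.to_str()?;
    if let Some(util) = lookup(base) {
        return Some((util, &argv[1..]));
    }
    let sub = argv.get(1)?;
    lookup(sub).map(|util| (util, &argv[2..]))
}
```

A `main` function would collect `std::env::args()` into a `Vec<String>`, call `dispatch`, and exit with the returned code, or print the list of available utilities when `dispatch` returns `None`.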

Implementation of GNU Test Execution for procps, util-linux, diffutils, and bsdutils

The project aims to integrate GNU test suite execution against the Rust-based versions of procps, util-linux, diffutils, and bsdutils. This ensures compatibility, which is crucial for seamless drop-in replacement. We have been doing this successfully for coreutils using GitHub Actions, a build script and a run script.

Refactoring factor

The uutils factor is currently significantly slower than GNU factor and only supports numbers up to 2^64-1. See issue 1559 and issue 1456 for more information.
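As a small illustration of the range limitation only, the sketch below factors values above 2^64-1 by using Rust's built-in `u128` type. It uses plain trial division, which is an assumption made for brevity: it is neither the algorithm uutils uses nor a fast one, and a real refactoring would need a better algorithm and likely arbitrary-precision integers. The function name `factor` is hypothetical.

```rust
/// Factor `n` by trial division, returning its prime factors in ascending
/// order (with repetition). Returns an empty vector for 0 and 1.
///
/// Illustration only: trial division is far too slow for inputs whose
/// smallest prime factor is large.
fn factor(mut n: u128) -> Vec<u128> {
    let mut factors = Vec::new();

    // Strip all factors of 2 so the main loop can skip even candidates.
    while n > 1 && n % 2 == 0 {
        factors.push(2);
        n /= 2;
    }

    // Try odd candidates d while d * d <= n. Writing the bound as
    // `d <= n / d` avoids overflowing u128 when d is large.
    let mut d: u128 = 3;
    while d <= n / d {
        while n % d == 0 {
            factors.push(d);
            n /= d;
        }
        d += 2;
    }

    // Whatever remains above 1 is itself prime.
    if n > 1 {
        factors.push(n);
    }
    factors
}
```

For example, `factor((1u128 << 64) + 1)` handles an input one past 2^64, which does not fit in a `u64`.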

Symbolic/Fuzz Testing and Formal Verification of Tool Grammars

See Using Lightweight Formal Methods to Validate a Key Value Storage Node In Amazon S3.

Most KLEE scaffolding was done for KLEE 2021.

Start with wc, formalize the command line grammar. Get it working under AFL++ and KLEE. Add several proofs of resource use and correctness - especially proofs about operating system calls and memory/cache usage. Generalize to other tools. Try to unify the seeds for the fuzzer and KLEE so they can help each other find new paths. Use QEMU to test several operating systems and architectures. Automate detection of performance regressions - try to hunt for accidentally quadratic behavior.

Specific to wc - formalize the inner loop over a UTF-8 buffer into a finite-state automaton with counters that can generalize into SIMD-width operations like simdjson. Further generalize into a monoid so K processors can combine their partial results.
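The monoid idea can be sketched as follows: each chunk of input is summarized independently, and two summaries are merged with an associative `combine` whose identity is the summary of the empty chunk. This sketch makes simplifying assumptions that the real wc does not: it works on raw bytes, treats only ASCII whitespace (as defined by Rust's `u8::is_ascii_whitespace`) as a word separator, and ignores UTF-8 and locale handling. The type and method names (`Counts`, `of_chunk`, `combine`) are hypothetical.

```rust
/// Summary of one chunk of input. `Counts::default()` is the identity
/// element (the summary of an empty chunk).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
struct Counts {
    bytes: usize,
    lines: usize,
    /// Number of maximal runs of non-whitespace bytes inside the chunk.
    words: usize,
    /// True if the first byte of the chunk is part of a word.
    starts_in_word: bool,
    /// True if the last byte of the chunk is part of a word.
    ends_in_word: bool,
}

impl Counts {
    /// Run the two-state automaton (in a word / not in a word) over a chunk.
    fn of_chunk(chunk: &[u8]) -> Counts {
        let mut c = Counts {
            bytes: chunk.len(),
            ..Counts::default()
        };
        let mut in_word = false;
        for &b in chunk {
            if b == b'\n' {
                c.lines += 1;
            }
            let is_space = b.is_ascii_whitespace();
            if !is_space && !in_word {
                c.words += 1; // transition: outside a word -> inside a word
            }
            in_word = !is_space;
        }
        c.starts_in_word = chunk.first().map_or(false, |b| !b.is_ascii_whitespace());
        c.ends_in_word = in_word;
        c
    }

    /// Merge the summary of a left chunk with that of the chunk right after it.
    fn combine(self, other: Counts) -> Counts {
        // A word that straddles the boundary was counted once on each side.
        let joined = self.ends_in_word && other.starts_in_word;
        Counts {
            bytes: self.bytes + other.bytes,
            lines: self.lines + other.lines,
            words: self.words + other.words - usize::from(joined),
            starts_in_word: if self.bytes == 0 {
                other.starts_in_word
            } else {
                self.starts_in_word
            },
            ends_in_word: if other.bytes == 0 {
                self.ends_in_word
            } else {
                other.ends_in_word
            },
        }
    }
}
```

Because `combine` is associative, K workers can each call `Counts::of_chunk` on their own slice and the results can be merged in order, for example with `data.chunks(n).map(Counts::of_chunk).fold(Counts::default(), Counts::combine)`.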

Development of advanced terminal session recording and replay tools in Rust

This project involves creating Rust-based implementations of /usr/bin/script, /usr/bin/scriptlive, and /usr/bin/scriptreplay. The /usr/bin/script command will record terminal sessions, /usr/bin/scriptlive will offer real-time recording features, and /usr/bin/scriptreplay will be used to replay recorded sessions.

The work will happen in https://github.com/uutils/bsdutils.

Official Redox support

We want to support the Redox operating system, but are not actively testing against it. Since the last round of fixes in #2550, many changes have probably been introduced that break Redox support. This project would involve setting up Redox in the CI and fixing any issues that arise and porting features over.