Mismatched Encoding in Rate-Distortion Theory

Summer Internship (Remote)

Guide: Prof. Jonathan Scarlett
Duration
: May 2021 - February 2022

Rate-distortion is a branch of information theory that deals with finding the theoretical limits of lossy compression. In the most basic of settings, a random input string is fed to an encoder, which evaluates the codeword against a pre-specified "codebook" (literally, a "book" of "codewords", a list of strings that the input string can be encoded into) and spits out the index of the codeword that most closely resembles the input string. While decoding, the decoder outputs the codeword corresponding to the encoded index. The minimum average "distortion" incurred by such a system given the statistical properties of the information source is well-known result in rate-distortion theory.

One issue here, however, is that the encoder (which can be thought of more as an immutable utility once manufactured) and the end user might have different notions of what resemblance (and therefore distortion) means. This means that even when the encoder has a "good" codebook (i.e. one that minimizes distortion between the input and the output), for some input strings, it might not pick the optimal codeword. There exist examples where such a mismatch leads to the average distortion being strictly larger in case of a mismatch when codebooks are constructed using standard techniques.

In this project, we investigate the construction of codebooks using techniques from multi-user channel coding (another branch of information theory where the objective is to effectively communicate through a shared information channel even as transmissions from other users interfere with your own, akin to trying to talk in a crowded room) and establish achievable levels of distortion given a codebook size. We also prove that such codebooks achieve the "matched" minimum distortion in some of the aforementioned examples.

Creative Commons License
This image is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Image source: WikiMedia Commons