Nested Loop Breaking in C#

1 year ago · March 9, 2018 · 10:18 pm

During my C# endeavours this week I implemented an optimistic algorithm to find the most recent commit in which context a specific file has been modified. For performance reasons I chose an optimistic approach where, in the best-case, I assume the commit of interest to be at the very first position of the list I was enumerating. In the worst-case we have to go through the whole list, yielding at least n comparisons given a list of length n. In the average-case, we find the commit somewhere in between, and can stop the enumeration as soon as we have found it. This means a complexity class of O(n) for this algorithm.

So far for the functional requirements, but how to implement it regarding maintainability and readability? My first approach was the traditional way of using nested for-loops. This, one the one hand gets the job done, but one the other hand is ugly to read and I have to say not satisfying. In C# we also have the possibility of using a GOTO statement. However, ugliness levels are significantly risen and I can't quite remember to have used this jumping control flow style since my BASIC times in the mid 90's. Surely there might be situations in which GOTO is less evil than other choices, but personally I discourage its use and would only fall-back to it as my very last resort.

Okay, what other options do we have?

The next feasible approach is to extract the nesting into a dedicated method, which returns prematurely. That way, our code is more readable since our calling method is delegating functionality it should not be responsible for. Remember, one method, one purpose! :-D
Nonetheless, one might still argue that debugging is now slightly harder since we jump from method to method and potentially lose context. A small price I'm willing to pay.

With the extract method pattern, we have arrived somewhere between traditional approach and best-practice, which is better. Now there is one code construct we can use to improve code readability and to totally get rid of the nesting. Namely, with the yield keyword, C# gives us the possibility to break out of enumerators. This is similar to streams as we know them from Java. An enumerator is the C# term for iterators in Java. Instead of for-loops, we make use of the generic IEnumerator<T> interface and the LINQ .SelectMany to basically arrive at a one-liner for our method head. Our method body is still declared in an extracted, non-anonymous method, but this time we use yield return commit, to tell the enumerator to stop instantly if we found our commit. That way, we have combined the advantages of the extract method pattern with a high level of readability, while not loosing our algorithm's capability to hit its average-case.

I'm excited to discover further usecases in which the yield approach leads to improved code quality, and I am also eager to discover its underlying mechanics - whether its syntactic sugar, or if we actually have performance impacts, and if something , in which complexity class they lie in.

Till next time, happy refactoring everyone!