Reporting NUKE build failures on Slack

Let’s make our NUKE builds report failures by themselves, using a little reflection magic.

At Coolblue, we run a lot of builds and deployments. Because we practice CD, they all run automatically. This means that after we merge a pull request, our build server can spend up to half an hour building, testing, deploying, and validating. I don’t know about you, but I don’t have the attention span to watch slowly advancing progress bars for half an hour. So instead, when one of the build or deployment stages fails, we’d like to be notified, preferably on Slack, our messaging platform of choice, although this article could apply to any mechanism. Sure, most build servers have integrations with most messaging platforms nowadays, but what if yours doesn’t, or you don’t want to rely on those?

You don’t want to have to write try/catch blocks all over the place, so it has to be either something you can inject, or a hook that runs some code when the build is finished. Some build frameworks make this very easy. For NUKE, it’s as simple as overriding the OnBuildFinished method.

class Build: NukeBuild
{
  // An override in a different assembly than NukeBuild must be 'protected', not 'protected internal'.
  protected override void OnBuildFinished()
  {
    if (!IsSuccessful)
    {
      SlackTasks.SendSlackMessage(/* omitted for brevity */);
    }
  }
}

Easy. But how do we make this reusable? We don’t want people to have to copy and paste entire blocks of code. It would be nicer if it could just be ‘plugged in’.

Targets

In my previous article, I showed how NUKE’s build components make it easy to assemble a build script using only the components that you need. Adding a new target to the system is trivial, as is wiring it up to other targets.

Since we want our target to execute after the build finishes, we can just tack it onto our ‘main’ targets, right? Note that I’m simplifying the type structure for brevity.

class Build: NukeBuild
{
  Target BuildRelease =>
    _ => _.Executes(() => { });

  Target Release =>
    _ => _.Executes(() => { });

  Target NotifyFailureOnSlack =>
    _ => _.AssuredAfterFailure() // always run, regardless of other targets failing
          .TryTriggeredBy<Build>(b => b.BuildRelease, b => b.Release)
          .Executes(
            () => 
            {
              // notify if build failed
            }
          );
}

Unfortunately, things are not that simple. The first obstacle is detecting that the build has failed. IsSuccessful won’t work here: it always returns false while the build is still in progress. Instead, you’d need to look at ExecutingTargets to see if any of them has failed.
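To make that check concrete, here is a minimal sketch. So that it runs without NUKE, it uses stand-in types: TargetStatus, TargetResult and FailureDetection are hypothetical names for illustration, not NUKE’s own.

```csharp
using System.Collections.Generic;
using System.Linq;

// Stand-in for a target's execution status (assumption, not NUKE's own enum).
public enum TargetStatus { NotRun, Executing, Succeeded, Failed }

// Stand-in for a target in the execution plan (assumption).
public sealed class TargetResult
{
    public TargetResult(string name, TargetStatus status)
    {
        Name = name;
        Status = status;
    }

    public string Name { get; }
    public TargetStatus Status { get; }
}

public static class FailureDetection
{
    // True as soon as any target has failed, even if others are still running or pending.
    public static bool AnyTargetFailed(IEnumerable<TargetResult> targets) =>
        targets.Any(t => t.Status == TargetStatus.Failed);
}
```

In a real build, the same check would run over ExecutingTargets inside the notification target, presumably by inspecting each target’s status.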

There’s a much bigger problem, though. NUKE can execute targets in any order it wants, unless you’ve specifically told it to execute a particular target before or after another one. This means if BuildRelease itself also triggers a different target, it could end up being run after NotifyFailureOnSlack. If that target then fails, it will go unnoticed. This is clearly not the right solution.

Attributes

An aspect of NUKE that is not documented at all on the official website is build extension attributes. When you set up a NUKE project, you’ll notice that the generated Build class is decorated with a bunch of different attributes.

[CheckBuildProjectConfigurations]
[ShutdownDotNetAfterServerBuild]
[DotNetVerbosityMapping]
class Build: NukeBuild
{
}

It turns out that these attributes are derived from a class called BuildExtensionAttributeBase, and that NUKE has code in place to recognize attributes that inherit from it as, well, extensions to the build class. There are also a number of interfaces that represent ‘events’, such as IOnBuildFinished. When implemented on your build or on a build extension, they let that component respond to build events. OnBuildFinished should sound familiar. Let’s see what this looks like.

public class NotifyFailureOnSlackAttribute: BuildExtensionAttributeBase, IOnBuildFinished
{
  public void OnBuildFinished(NukeBuild build)
  {
    if (!build.IsSuccessful)
    {
      // notify
      Logger.Error($"{nameof(NotifyFailureOnSlackAttribute)}: Build failed!");
    }
  }
}

Does it work?

╬════════════
║ Failure
╬═══

> /usr/local/share/dotnet/dotnet build DoesNotExist.csproj --verbosity Minimal
Microsoft (R) Build Engine version 16.10.1+2fd48ab73 for .NET
Copyright (C) Microsoft Corporation. All rights reserved.

MSBUILD : error MSB1009: Project file does not exist.
Switch: DoesNotExist.csproj
ProcessException: Process 'dotnet' exited with code 1.
   > /usr/local/share/dotnet/dotnet build DoesNotExist.csproj --verbosity Minimal
   @ /Users/e.heemskerk/Git/Prive/NukeTest/build

   at Nuke.Common.Tooling.ProcessExtensions.AssertZeroExitCode(IProcess process)
   at Nuke.Common.Tools.DotNet.DotNetTasks.DotNetBuild(DotNetBuildSettings toolSettings)
   at Nuke.Common.Tools.DotNet.DotNetTasks.DotNetBuild(Configure`1 configurator)
   at Build.<>c.<get_Failure>b__2_1() in /Users/e.heemskerk/Git/Prive/NukeTest/build/Build.cs:line 21
   at Nuke.Common.Execution.TargetDefinition.<>c__DisplayClass61_0`1.<Executes>b__0()
   at Nuke.Common.Execution.BuildExecutor.<>c.<Execute>b__4_0(Action x)
   at Nuke.Common.Utilities.Collections.EnumerableExtensions.ForEach[T](IEnumerable`1 enumerable, Action`1 action)
   at Nuke.Common.Execution.BuildExecutor.Execute(NukeBuild build, ExecutableTarget target, IReadOnlyCollection`1 previouslyExecutedTargets, Boolean failureMode)

Repeating warnings and errors:
MSBUILD : error MSB1009: Project file does not exist.
ProcessException: Process 'dotnet' exited with code 1.
   > /usr/local/share/dotnet/dotnet build DoesNotExist.csproj --verbosity Minimal
   @ /Users/e.heemskerk/Git/Prive/NukeTest/build

═══════════════════════════════════════
Target             Status      Duration
───────────────────────────────────────
Failure            Failed        < 1sec
───────────────────────────────────────
Total                            < 1sec
═══════════════════════════════════════

Build failed on 07/03/2021 14:10:17. (╯°□°)╯︵ ┻━┻
NotifyFailureOnSlackAttribute: Build failed!

Error messages

While this works, the message isn’t very useful. It doesn’t tell you what the failure was, or in which build it occurred. Let’s remedy that.

Unfortunately, getting error messages from NUKE isn’t that simple. Typically you’d do this by (also) writing log messages to an in-memory logger, which lets you sift through the messages it captured at your leisure. However, there is no public method or property that lets you do this; logging sinks are automatically configured by NUKE through an internal property.

You might have noticed that the build output of our first attempt included a section titled ‘Repeating warnings and errors’. This must mean the messages are being stored somewhere, so maybe we can take advantage of that as well.

It turns out we can. The abstract class OutputSink has an internal field called SevereMessages, which is where warnings and errors are copied to as the build is running. How do you get an OutputSink? The Logger class has an internal static field called OutputSink. Given all these fields are internal, using reflection seemed the simplest approach to getting these messages out of there:

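using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using Nuke.Common;              // Logger, LogLevel
using Nuke.Common.OutputSinks;  // OutputSink (namespace as of NUKE 5.x)

internal static class LoggerUtility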
{
  public static OutputSink GetOutputSink() => (OutputSink) _outputSinkField.GetValue(null);

  static LoggerUtility()
  {
    _outputSinkField = typeof(Logger).GetField(_outputSinkFieldName, BindingFlags.Static | BindingFlags.NonPublic)
                    ?? throw new InvalidOperationException($"Couldn't find '{_outputSinkFieldName}' field on type {nameof(Logger)}.");
  }

  private static readonly FieldInfo _outputSinkField;
  private const string _outputSinkFieldName = "OutputSink";
}

internal static class OutputSinkExtensions
{
  public static string GetErrorMessages(this OutputSink sink)
  {
    var messages = (List<Tuple<LogLevel, string>>) _severeMessagesField.GetValue(sink);

    return string.Join(
      System.Environment.NewLine,
      messages!.Where(m => m.Item1 == LogLevel.Error)
               .Select(t => t.Item2)
    );
  }

  static OutputSinkExtensions()
  {
    _severeMessagesField = typeof(OutputSink).GetField(_severeMessagesFieldName, BindingFlags.Instance | BindingFlags.NonPublic)
                        ?? throw new InvalidOperationException($"Couldn't find '{_severeMessagesFieldName}' field on type {nameof(OutputSink)}.");
  }

  private static readonly FieldInfo _severeMessagesField;
  private const string _severeMessagesFieldName = "SevereMessages";
}
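With these helpers in place, the attribute from earlier can include the captured errors in its message. The following is a sketch rather than the definitive implementation: it uses only the types shown above, assumes the same using directives as the earlier attribute snippet, and still logs locally where a real version would send the text to Slack.

```csharp
public class NotifyFailureOnSlackAttribute: BuildExtensionAttributeBase, IOnBuildFinished
{
  public void OnBuildFinished(NukeBuild build)
  {
    if (build.IsSuccessful)
    {
      return;
    }

    // Read the captured error messages through the reflection helpers above.
    var errors = LoggerUtility.GetOutputSink().GetErrorMessages();

    // Placeholder for the actual notification: log the collected errors.
    Logger.Error($"{nameof(NotifyFailureOnSlackAttribute)}: {errors}");
  }
}
```

Because the helpers rely on internal field names, they may break when NUKE is upgraded. The static constructors throw an InvalidOperationException in that case, which at least makes the cause easy to spot.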

This should give us a relatively detailed description of what went wrong.

╬════════════
║ Failure
╬═══

> /usr/local/share/dotnet/dotnet build DoesNotExist.csproj --verbosity Minimal
Microsoft (R) Build Engine version 16.10.1+2fd48ab73 for .NET
Copyright (C) Microsoft Corporation. All rights reserved.

MSBUILD : error MSB1009: Project file does not exist.
Switch: DoesNotExist.csproj
ProcessException: Process 'dotnet' exited with code 1.
   > /usr/local/share/dotnet/dotnet build DoesNotExist.csproj --verbosity Minimal
   @ /Users/e.heemskerk/Git/Prive/NukeTest/build

[...]

═══════════════════════════════════════
Target             Status      Duration
───────────────────────────────────────
Failure            Failed        < 1sec
───────────────────────────────────────
Total                            < 1sec
═══════════════════════════════════════

Build failed on 07/03/2021 14:10:17. (╯°□°)╯︵ ┻━┻
NotifyFailureOnSlackAttribute: MSBUILD : error MSB1009: Project file does not exist.
ProcessException: Process 'dotnet' exited with code 1.
   > /usr/local/share/dotnet/dotnet build DoesNotExist.csproj --verbosity Minimal
   @ /Users/e.heemskerk/Git/Prive/NukeTest/build

As for which build the error occurred in, the available information differs between build servers. We use TeamCity, so we can get the server URL and the build ID from NUKE’s built-in TeamCity class, which allows us to construct a direct URL to the build log of the failed build. Different CI systems might expose different information, and there might be more information available in environment variables than what is exposed by NUKE.
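As an illustration, the sketch below builds such a URL and posts a message to a Slack incoming webhook. The SlackFailureNotifier class and its method names are hypothetical, the viewLog.html?buildId= format is an assumption based on TeamCity’s classic build log links, and the webhook URL is one you would create in Slack yourself.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper, not part of NUKE.
public static class SlackFailureNotifier
{
    // Assumption: TeamCity serves build logs at <server>/viewLog.html?buildId=<id>.
    public static string BuildLogUrl(string serverUrl, string buildId) =>
        $"{serverUrl.TrimEnd('/')}/viewLog.html?buildId={Uri.EscapeDataString(buildId)}";

    // Slack incoming webhooks accept a JSON body with a "text" property.
    public static string CreatePayload(string errors, string buildLogUrl) =>
        JsonSerializer.Serialize(new { text = $"Build failed: {buildLogUrl}\n{errors}" });

    // Posts the payload to the webhook and throws if Slack does not return a success status.
    public static async Task SendAsync(HttpClient client, string webhookUrl, string payload)
    {
        using var content = new StringContent(payload, Encoding.UTF8, "application/json");
        using var response = await client.PostAsync(webhookUrl, content);
        response.EnsureSuccessStatusCode();
    }
}
```

In the build extension, the server URL and build ID would come from NUKE’s TeamCity class, and the errors from the GetErrorMessages helper shown earlier.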

Conclusion

While it takes a little bit of work, you can get notifications directly from NUKE into Slack, which saves you from having to babysit your CI system each time you merge a change.

Here’s an actual notification (obviously edited):

[Image: A failure notification in Slack]