Representing C/C++ unions and bitfields in C#



You are a seasoned C++ applications or embedded programmer, and you need to access an integer bitfield as a set of specific bits. You know how to do this:

union myUnion
{
 unsigned int info;
 struct
 {
  unsigned int flag1 : 1; // bit 0
  unsigned int flag2 : 1; // bit 1
  unsigned int flag3 : 1; // bit 2
  unsigned int flag4 : 1; // bit 3
  unsigned int flag5 : 1; // bit 4
  unsigned int flag6 : 1; // bit 5
  // .
  // .
  // .
  unsigned int flag32 : 1; // bit 31
 };
};

Suppose, however, that you are a C#/.NET programmer. What do you do then? The language has no direct support for this.
What tools do you have at your disposal?
Well, you do have:

[StructLayout(LayoutKind.Explicit, Size = 4, CharSet = CharSet.Ansi)]

That gives you control over the byte layout. Size is the total size of the struct in bytes; it is 4 here because the struct will hold a single 32-bit value. You also have:

[FieldOffset(x)]

where x is the field's byte offset.

Fields that share the same offset overlap in memory, and that gives you your union.

What else are we missing? What we need is the bitfield operator. That we get from the BitVector32 class (in System.Collections.Specialized). There are, however, a number of things we need to keep in mind:
  • We need to specify explicitly the size of the variables that will contain the data in question. This first item is very important to pay attention to. In the union above, for example, we specify info as unsigned int. What size is unsigned int? 32 bits? 64 bits? In C/C++, that depends on the compiler and the target platform. On the C# side the sizes are fixed: int and uint are always 32 bits, and BitVector32 always wraps exactly 32 bits, whichever runtime you are on (.NET Core, Mono, or .NET Framework on Win32/Win64). The native side therefore has to use exactly 32 bits as well.
  • What is the byte order ('endianness'), and in what order does the native compiler allocate bitfields? Again, we may be running on any type of hardware (think Mobile/Xamarin, not just laptop or tablet) -- as such, we cannot assume Intel conventions.
  • We will want to avoid anything that depends on the runtime's memory management. In C/C++ parlance, stick to POD (plain old data) types; in C#, that means sticking to value types only.
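
For the C++ readers, the first two points can be sketched on the native side. This is a minimal illustration under my own assumptions, not code from the sample source: byte_at is a hypothetical helper. It pins the storage width with std::uint32_t and pulls bytes out with shifts, which work on the value rather than on memory, so the result does not depend on the host's byte order.

```cpp
#include <climits>
#include <cstdint>

// Pin the storage width: std::uint32_t is exactly 32 bits wherever it exists.
static_assert(sizeof(std::uint32_t) * CHAR_BIT == 32, "expected a 32-bit word");

// Hypothetical helper: byte `index` (0 = least significant) of `value`.
// Shifts operate on the value, not on its in-memory representation, so the
// result is the same on little-endian and big-endian hosts.
constexpr std::uint8_t byte_at(std::uint32_t value, unsigned index)
{
    return static_cast<std::uint8_t>((value >> (8u * index)) & 0xFFu);
}
```

With this, byte_at(0x11223344, 0) is 0x44 and byte_at(0x11223344, 3) is 0x11, whatever the hardware.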

Let's go ahead and make the assumption, for this example, that sizeof(int) == 4 (that is, an int is 32 bits).
The trick, then, is to ensure the following:
  • All data is byte aligned.
  • The bitfields and info field align on the same byte boundary.

Here is how we accomplish that:

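using System;
using System.Collections.Specialized;
using System.Runtime.InteropServices;

// BitVector32 wraps a single 32-bit value, so the struct is 4 bytes in total
[StructLayout(LayoutKind.Explicit, Size = 4, CharSet = CharSet.Ansi)]
public struct MyUnion
{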
 #region Lifetime
 
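 /// <summary>
 /// Initializes the bitfield and its sections.
 /// </summary>
 /// <param name="foo">Unused; it only distinguishes this constructor from the parameterless one.</param>
 // ReSharper disable once UnusedParameter.Local
 public MyUnion(int foo)
 {
  // allocate the bitfield with all bits cleared
  info = new BitVector32(0);
 
  // initialize bitfield sections; the first argument is the section's maximum value,
  // and each section starts right after the one passed as the second argument
  flag1 = BitVector32.CreateSection(1);        // bit 0
  flag2 = BitVector32.CreateSection(1, flag1); // bit 1
  flag3 = BitVector32.CreateSection(1, flag2); // bit 2
 }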
 
 #endregion
 
 #region Bitfield
 
 // The 32 bits of storage, at byte offset 0 of the struct.
 [FieldOffset(0)]
 private BitVector32 info;
 
 #endregion
 
 #region Bitfield sections
 
 /// <summary>
 /// Section - Flag 1
 /// </summary>
 private static BitVector32.Section flag1;
 
 /// <summary>
 /// Section - Flag 2
 /// </summary>
 private static BitVector32.Section flag2;
 
 /// <summary>
 /// Section - Flag 3
 /// </summary>
 private static BitVector32.Section flag3;
 
 #endregion
 
 #region Properties
 
 /// <summary>
 /// Flag 1
 /// </summary>
 public bool Flag1
 {
  get { return info[flag1] != 0; }
  set { info[flag1] = value ? 1 : 0; }
 }
 
 /// <summary>
 /// Flag 2
 /// </summary>
 public bool Flag2
 {
  get { return info[flag2] != 0; }
  set { info[flag2] = value ? 1 : 0; }
 }
 
 /// <summary>
 /// Flag 3
 /// </summary>
 public bool Flag3
 {
  get { return info[flag3] != 0; }
  set { info[flag3] = value ? 1 : 0; }
 }
 
 #endregion
 
 #region ToString
 
 /// <summary>
 /// Allows us to represent this in human readable form
 /// </summary>
 /// <returns></returns>
 public override string ToString()
 {
  return $"Name: {nameof(MyUnion)}{Environment.NewLine}Flag1: {Flag1} Flag2: {Flag2} Flag3: {Flag3}{Environment.NewLine}BitVector32: {info}{Environment.NewLine}";
 }
 
 #endregion
}
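
If you think in C++ terms, it may help to see what a section boils down to. The sketch below is my own illustration, not code from the sample source: get_field, set_field and the k-prefixed constants are hypothetical names. It treats a section as a right-aligned mask plus a bit offset, which is the same information BitVector32.Section exposes through its Mask and Offset properties.

```cpp
#include <cstdint>

// Read a field: shift it down to bit 0, then keep only the masked bits.
constexpr std::uint32_t get_field(std::uint32_t word, std::uint32_t mask, unsigned offset)
{
    return (word >> offset) & mask;
}

// Write a field: clear its bits, then OR in the new value, truncated to the mask.
constexpr std::uint32_t set_field(std::uint32_t word, std::uint32_t mask, unsigned offset,
                                  std::uint32_t value)
{
    return (word & ~(mask << offset)) | ((value & mask) << offset);
}

// MyUnion's layout: three one-bit sections at offsets 0, 1 and 2.
constexpr std::uint32_t kFlagMask = 0x1u;
constexpr unsigned kFlag1Offset = 0;
constexpr unsigned kFlag2Offset = 1;
constexpr unsigned kFlag3Offset = 2;
```

Setting Flag2 on a zeroed word, set_field(0, kFlagMask, kFlag2Offset, 1), yields 2: only bit 1 is set, and the other two flags still read back as 0.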

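Pay particular attention to the constructor. Before C# 10, you could not define a parameterless constructor for a struct in C#. However, we need some way to ensure that the BitVector32 object and its sections are initialized properly before use. We accomplish that by requiring a constructor that takes a dummy integer parameter. Be aware that new MyUnion() and default(MyUnion) bypass this constructor, and because the sections are static fields assigned inside it, they stay uninitialized until the first call to MyUnion(int). We initialize the object like this: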

/// <summary>
/// Main entry point
/// </summary>
/// <param name="args"></param>
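static void Main(string[] args)
{
 var myUnion1 = new MyUnion(0)
 {
  Flag2 = true
 };

 // print the flags and the raw bits via ToString()
 Console.WriteLine(myUnion1);
}

By the way, you aren't by any means limited to single-bit fields -- you can define wider ones as well (BitVector32.CreateSection takes the section's maximum value as a short, so a single section tops out at 15 bits). For example, if I were to change the example to be: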

union myUnion2
{
 unsigned int info;
 struct
 {
  unsigned int flag1 : 3; // bit 0-2
  unsigned int flag2 : 1; // bit 3
  unsigned int flag3 : 4; // bit 4-7
  unsigned int flag4 : 1; // bit 8
  // .
  // .
  // .
  unsigned int flag31 : 1; // bit 31
 };
};

I would just change my struct to be:

[StructLayout(LayoutKind.Explicit, Size = 4, CharSet = CharSet.Ansi)]
public struct MyUnion2
{
 #region Lifetime
 
 /// <summary>
 /// Ctor
 /// </summary>
 /// <param name="foo"></param>
 public MyUnion2(int foo)
 {
  // allocate the bitfield
  info = new BitVector32(0);
 
  // initialize bitfield sections; the first argument is the section's maximum value
  flag1 = BitVector32.CreateSection(0x07);        // 3 bits: bits 0-2
  flag2 = BitVector32.CreateSection(1, flag1);    // 1 bit:  bit 3
  flag3 = BitVector32.CreateSection(0x0f, flag2); // 4 bits: bits 4-7
 }
 
 #endregion
 
 #region Bitfield
 
 // The 32 bits of storage, at byte offset 0 of the struct.
 [FieldOffset(0)]
 private BitVector32 info;
 
 #endregion
 
 #region Bitfield sections
 
 /// <summary>
 /// Section - Flag1
 /// </summary>
 private static BitVector32.Section flag1;
 
 /// <summary>
 /// Section - Flag2
 /// </summary>
 private static BitVector32.Section flag2;
 
 /// <summary>
 /// Section - Flag3
 /// </summary>
 private static BitVector32.Section flag3;
 
 #endregion
 
 #region Properties
 
 /// <summary>
 /// Flag 1
 /// </summary>
 public int Flag1
 {
  get { return info[flag1]; }
  set { info[flag1] = value; }
 }
 
 /// <summary>
 /// Flag 2
 /// </summary>
 public bool Flag2
 {
  get { return info[flag2] != 0; }
  set { info[flag2] = value ? 1 : 0; }
 }
 
 /// <summary>
 /// Flag 3
 /// </summary>
 public int Flag3
 {
  get { return info[flag3]; }
  set { info[flag3] = value; }
 }
 
 #endregion
 
 #region ToString
 
 /// <summary>
 /// Allows us to represent this in human readable form
 /// </summary>
 /// <returns></returns>
 public override string ToString()
 {
  return $"Name: {nameof(MyUnion2)}{Environment.NewLine}Flag1: {Flag1} Flag2: {Flag2} Flag3: {Flag3}{Environment.NewLine}BitVector32: {info}{Environment.NewLine}";
 }
 
 #endregion
}
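
To double-check that the chained sections really land on the bits named in the C++ comments, here is a small sketch of the arithmetic. It rests on my understanding that each chained section starts immediately after the previous one and is as wide as its maximum value requires; bits_needed, pack and the constants are hypothetical names of mine, not part of BitVector32 or the sample source.

```cpp
#include <cstdint>

// Number of bits needed to hold values in the range 0..max_value.
constexpr unsigned bits_needed(unsigned max_value)
{
    return max_value == 0 ? 0 : 1 + bits_needed(max_value >> 1);
}

// Each section starts where the previous one ends.
constexpr unsigned kU2Flag1Offset = 0;
constexpr unsigned kU2Flag2Offset = kU2Flag1Offset + bits_needed(0x07); // 0 + 3
constexpr unsigned kU2Flag3Offset = kU2Flag2Offset + bits_needed(1);    // 3 + 1

// Pack three field values into one word using the layout above.
// Values wider than their field are truncated by the mask.
constexpr std::uint32_t pack(std::uint32_t flag1, std::uint32_t flag2, std::uint32_t flag3)
{
    return ((flag1 & 0x07u) << kU2Flag1Offset)
         | ((flag2 & 0x01u) << kU2Flag2Offset)
         | ((flag3 & 0x0Fu) << kU2Flag3Offset);
}
```

The offsets come out as 0, 3 and 4, matching "bit 0-2", "bit 3" and "bit 4-7" in myUnion2, and pack(5, 1, 9) gives 0x9D (5 in bits 0-2, 1 in bit 3, 9 in bits 4-7).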

One last word on this topic... do this only if you absolutely MUST. It requires specialized knowledge of your OS environment, the language and runtime you are using, your calling convention, and a host of other brittle assumptions.

In any other context, there are so many code smells here that the whole thing screams non-portability. But given the problem we are trying to solve, where the whole point is to run close to the hardware and we need this kind of precision, I think the ends do justify the means.

Caveat emptor!

Sample source (CSharpBitfields) can be found in my Git repository here:
SampleSource
