
How are we going to run this thing?

Before I get too involved with any additional work on Bobo, I need to address something pretty fundamental -- how am I going to run and/or test it?

The original Daxat Enterprise Search Engine (ESS) didn't have any unit tests. It did have a bunch of 'full-system integration tests', as it were -- basically, jobs which created massive indexes and verified them. For example, one of the tests indexed all of the contents of Project Gutenberg; another indexed all of my personal email. There were also a bunch of smaller test utilities so I could run various types of performance tests, query tests, etc. Unfortunately, all of those tests rely on a fairly complete search engine.

When the engine was first being written, it was possible to simply run parts of it -- that was the nature of how I wrote it. But as the code base grew, those short little hooks into the system slowly got coded away.

And, given that this is a porting effort, I'm electing to start with some fairly deep, low-level parts of the architecture, which is somewhat backwards considering the order in which the code was originally written. As such, there really isn't a good existing way to run and test the parts of the code I am porting.

There's also an odd directory structure to the original source code, a consequence of how Visual Studio deals with Solutions and Projects. For example, the BTree code is in the ESS\ESS\Daxat.Ess.Engine\Indexers\BTreeIndexers directory. I think the two top-level folders can be replaced with src. In doing so, we can also create a peer directory named test, which works well with .NET Core's testing capabilities (they seem to really like src\Project and test\Project.Tests and seem to hate most other directory layouts).
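As a rough sketch of how the tooling can be told about this layout: the project.json-era CLI reads a global.json file that lists the folders containing projects. The fragment below assumes the file sits at the repository root (the Bobo folder, one level above src). The test folder does not exist yet, and a test project named Daxat.Ess.Engine.Tests would be my guess at a name that follows the src\Project and test\Project.Tests convention, not something already in the repo.

```json
{
  "projects": [ "src", "test" ]
}
```

With something like this in place, a project under test can reference a project under src by name rather than by relative path.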

With these changes in place, we can start to make the move off of Visual Studio 2015 and onto Visual Studio Code. But, more importantly, we can make the move off of .NET Framework and onto .NET Core. But what does it really take to do that?

Well, the first step is to go to the source code directory and set up a .NET Core class library project (since the original code is a class library). To do that, we run dotnet new with the --type lib option, which sets up project.json for a class library.

C:\Users\todd_\Source\Repos\Bobo\src\Daxat.Ess.Engine>dotnet new --type lib
Created new C# project in C:\Users\todd_\Source\Repos\Bobo\src\Daxat.Ess.Engine.

This creates a default Library.cs file which we don't need, so we can simply delete it. We then run code . from the source directory, and let VS Code do its set-up magic.

C:\Users\todd_\Source\Repos\Bobo\src\Daxat.Ess.Engine>del Library.cs

C:\Users\todd_\Source\Repos\Bobo\src\Daxat.Ess.Engine>code .

[Image: VS Code build/debug assets]

The main problem now is that, although we are just focusing on a few select .cs files in the Indexers folder, we've effectively picked up all of the original code for this class library. Unsurprisingly, pressing CTRL+SHIFT+B to build the project results in hundreds (thousands?) of errors. In its current state, the project will not build until every file in it compiles.

One way to deal with this is to exclude files from the build process via the project.json file. For example, the following change eliminates all of the compilation errors outside of Indexers\BTreeIndexers, although there are still hundreds of build errors:

  "buildOptions": {
    "debugType": "portable",
    "compile": {
      "exclude": [
        "Documents/**/*",
        "Enumerators/**/*",
        "Grammar/**/*",
        "Indexers/FullTextIndexers/**/*",
        "Indexers/InvertedWordLists/**/*",
        "Indexers/Nodes/**/*",
        "Indexers/SuffixTrees/**/*",
        "Indexers/UniqueWords/**/*",
        "Literals/**/*",
        "Metadata/**/*",
        "Scoring/**/*",
        "Utils/**/*",
        "*.csproj*",
        "*.vspscc",
        ".cvsignore",
        "AssemblyInfo.cs"
      ]
    }
  },

We can be even more aggressive and also exclude many of the individual .cs files in the BTreeIndexers folder, removing them from the exclusion list as we start to work on them. Doing so gets us down to 4 warnings and 38 errors, which feels manageable. But most of the remaining errors occur because the code now refers to types that live in the excluded files and so do not exist as far as the compiler is concerned.
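Excluding individual files might look something like the sketch below. The two BTreeIndexers file names are placeholders I made up for illustration, not real files from the project, and the folder-level entries from the earlier listing are abbreviated to a single line here. The idea is simply that an exclude entry can be a plain file path as well as a glob, the same way AssemblyInfo.cs is listed above.

```json
{
  "buildOptions": {
    "debugType": "portable",
    "compile": {
      "exclude": [
        "Indexers/FullTextIndexers/**/*",
        "Indexers/BTreeIndexers/NotPortedYetA.cs",
        "Indexers/BTreeIndexers/NotPortedYetB.cs"
      ]
    }
  }
}
```

As each file is ported, its line is deleted from the list and the file rejoins the build.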

Clearly, we are going to need some kind of mocking framework to get us through this effort (and stubs, and fakes...). More on that next time.