RNGTest design

This page collects some architecture details and software design notes from the time while RNGTest was converted from OpenMPI to Charm++ parallelism.

Requirements, features

1. Works with current and future libraries of RNGs

Currently Intel's MKL's vector statistical library (VSL) RNGs, RNGSSE's RNGs, and a couple of Random123 RNGs are interfaced. References:

Note that MKL and RNGSSE2 are entirely optional. If unavailable, all features of Quinoa can be used via Random123.

All of these libraries implement a variety of different pseudo random number generators and importantly, all support concurrent sampling from random number streams: MKL supports both block splitting and leap frogging, RNGSSE supports block splitting (jumping ahead and initialization of a large number of independent streams), and Random123 RNGs are counter-based so counters can be initialized and incremented differently on each rank/thread.

2. Works with different libraries of statistical tests

Currently TestU01's SmallCrush, Crush, and BigCrush batteries are supported and the design allows for other libraries of statistical tests. Reference for TestU01: http://www.iro.umontreal.ca/~simardr/testu01/tu01.html

3. The batteries work in parallel

The user selects a number of RNGs (with optionally specifying parameters for each, e.g., seed, sequence length, etc.) as well as a battery of statistical tests. The tests are then run concurrently. RNGs from all RNG libraries can be tested (and thus compared to each other) at the same time.

4. Generator quality and computational cost

As the tests of the battery are run, the various RNGs are timed separately, providing a 'generator cost' ranking. The number of passed and failed tests provide a 'generator quality' ranking (if more than one RNGs are tested).

5. Concurrency

An arbitrary number of RNGs can be subjected to a single battery at a time. The tests are run concurrently using Charm++ on a single machine or across a set of machines distributed over the network. The software design is fully asynchronous yielding 100% CPU utilization at all times, independent of the time taken by the individual tests.

Concurrency via Charm++ requires serializable objects. Charm++ discourages pointers, function pointers, references, while encourages stateless objects (or objects with as little and simple state as possible).

Inheritance and runtime polymorphism

While Charm++ supports migratable objects (chares) that inherit from a pure base class to implement runtime polymorphism via reference semantics, the current design of RNGTest does not rely on Charm++ to do this. Instead we use Sean Parent's concept-based polymorphism which achieves polymorphism without client-side inheritance and enables value semantics. In concept-based polymorphism inheritance in confined to the internals of a "base" class, which, at instantiation, is initialized via a templated constructor by a "derived"-class object. This locality enables an easier reasoning about the code, eliminates the need for client-side heap-allocation, client-side indirection, and in general, leads to tighter and more readable, simpler client-code.

Battery rngtest::TestU01Suite is thus used polymorphically with rngtest::Battery. All polymorphism (in a classical OOP sense) is confined to the internals of class Battery. A client still uses rngtest::TestU01Suite polymorphically as a rngtest::Battery.

Analogous to the relationship between classes rngtest::Battery and rngtest::TestU01Suite, the random number generators, available from the MKL, RNGSSSE, and Random123 libraries, are also used polymorphically with class tk::RNG via concept-based polymorphism.

Analogous to the relationship between rngtest::Battery and rngtest::TestU01Suite, individual statistical tests of TestU01 are also used polymorphically with the "base" class rngtest::StatTest.

Test suite and test configuration

rngtest::TestU01Suite is configured at instantiation by selecting a TestU01 suite, one of rngtest::SmallCrush, rngtest::Crush, or rngtest::BigCrush.

Since the individual statistical tests of TestU01 are very similar in structure, but have some differences, e.g., the C library function used to create the results struct, the number and type of parameters, etc., which require non-trivial initialization, i.e., function pointers and variable number of test parameters, and rngtest::TestU01 is a Charm++ chare, the structure rngtest::TestU01 is kept minimal, simple, and generic. This is facilitated by templating on the properties class, rngtest::TestU01Props, containing the configuration for a given test. Since there is a large number of tests in each rngtest::TestU01 battery, a great amount of code can be reused by templating rngtest::TestU01 which still inherits (in a concept-based polymorphism sense) from rngtest::StatTest.

As discussed above, there are three different dimensions of runtime polymorphism in RNGTest: (1) the RNGs, (2) the statistical tests, and (3) the test suites, all exercising concept-based polymorphism.

Instantiation using factories

In those cases where the instantiation of the object (i.e., which object to instantiate) depends on user input (RNGs and test suites), polymorphism is facilitated by factories. Factories are std::maps (associative containers) that hold function objects and enable lookup based on a key, e.g., an enum based on user input. Function objects are essentially smart function pointers, holding various derived-class constructors and their constructor arguments. These function pointers and their arguments (i.e., how to invoke them) are stored by std::function, but the object is not instantiated at the time it is registered into the factory. The object is instantiated after a lookup (based on a key), after which the constructor is called thereby instantiating the derived object. For RNGs this map is tk::RNGFactory, for test suites it is rngtest::BatteryFactory.

The factories are implemented using maps via either reference or value semantics. Since concept-based polymorphism enables value semantics, value semantics is used in factories whenever possible. For more information on Boost.factory, see http://www.boost.org/doc/libs/release/libs/functional/factory.

Instantiation without factories

In the case when all registered objects have to be instantiated, there is no need for a factory, and the base class pointers are initialized by instantiating derived objects held in a std::vector. This is the case for statistical tests held by, e.g., TestU01Suite.

Charm++ design

Before porting to Charm++ RNGTest used OpenMP for concurrency. The last commit before porting to Charm++ is bfcab8f5 and the commit where the Charm++ port is fully functional is c504a51b. Note that during Charm++ port of rngtest target quinoa does not fully build.

The discussion below details only some of the details encountered during the Charm++ port of RNGTest. More documentation can be found in the source code itself as well as the individual (sometimes rather lenghty) commit messages between the two commits mentioned above.

Preliminary considerations

The above features, facilitating polymorphism, code reuse, e.g., factories, etc., are certainly desirable to keep in a Charm++ port of rngtest. However, the Charm++ implementation imposes additional challenges, especially in the view of the constrains on the global-scope wrappers (used as library-external calls). Some of the following challenges have been identified before porting to Charm++.

  • Global-scope trickery, especially with the polymorphic base class RNG pointers, is a challenge as any global-scope object in Charm++ must be initialized in the main chare and must be serializable so that the Charm++ runtime system can migrate them across the network. This means that the vector of base RNG pointers should be re-creatable after migration. This may also mean that its factory must also be in global scope and migratable. How to migrate std::function is a question. As it turns out Sean Parent's concept-based polymorphism (discussed above) significantly simplifies runtime polymorphism using Charm++: since polymorphism is kept internal to a non-chare class, e.g., rngtest::StatTest, and the client-code can use value-semantics, the "derived" class rngtest::TestU01 chare does not need Charm++'s machinery facilitating runtime polymorphism using the PUP::able framework. Though in the final design Charm++'s runtime polymorphism is not used, a working example is in commit 56bdcf73.
  • Conveniently holding references to Base is hardly an option with Charm++ objects that should be migratable. Making Base migratable is not a viable option and seems wasteful when only a small part of the data is needed by a class. Passing and storing only what's needed certainly seems like a more walkable route.
  • All migratable objects must have as little state as possible. This is for reasons of code complexity (less PUP routines to write and maintain), as well as computational cost (less data to migrate).
  • Converting each statistical test in a suite seems like the most straightforward to start with when porting to Charm++. However, if class rngtest::TestU01 holds nontrivial state, i.e., custom classes, they all require custom PUP routines and must also be polymorphic with base rngtest::StatTest. Furthermore, once a test was finished there should be a way to pass the control flow back to the invoking suite to evaluate the just-finished test. Does this require the suite be a chare object as well? (That is also polymorphic and holds some nontrivially migratable state.) How much does Charm++ help already with these issues? E.g., PUP routines for STL containers exist, but some are not optimal and don't use the latest C++ standard. As described below, most of these issues are eliminated by a careful design of classes with as little state as possible.

Design overview

The Charm++ design of RNGTest relies on three interacting Charm++ modules:

  1. rngtest – Beside global-scope data, this file describes the mainchare Main and a small helper chare, execute, both defined in Main/RNGTest.C.
  2. testu01suite – This file describes the chare rngtest::TestU01Suite.
  3. testu01 – This file describes the chare rngtest::TestU01, which is templated on the test properties, and thus lists all possible template specializations, corresponding to the various statistical tests available in the TestU01 library.

Read-only global-scope data

Background. In RNGTest both the RNGs and the statistical test suites are provided by third-party libraries. In other words, rngtest is only a glue-code that ties together the RNGs and the statistical tests. Note that neither third-party libraries (neither the test suites nor the RNGs) are aware of Charm++'s fully asynchronous nature. This constrains how the RNGs must be called by the batteries as the RNGs must be passed, as external generators, to the statistical test suites. Since the currently only supported set of batteries (TestU01) are a C library, this is done through global-scope wrappers.

Global-scope wrappers. The RNGs, wrapped through the base tk::RNG and children tk::MKLRNG, tk::RNGSSE, tk::Random123, are invoked via concept-based runtime polymorphism (discussed above). While the RNGs are instantiated via tk::RNGFactory (discussed above), to be able to call any (or all) of the instantiated RNGs at the same time (to facilitate concurrency), we need as many wrappers as the maximum number of RNGs, e.g., 20+. In principle, this requires 20+ global-scope almost identical wrappers so that we can pass them into TestU01. Note that the function signature TestU01 accepts for an external generator is the same for all RNGs. Templating the global-scope wrappers on an integer, however, allows to write the wrapper only once and let the compiler generate the 20+ slightly different wrappers, differing only in which polymorphic RNG needs to be called by TestU01. This requires the global-scope wrappers to rely on a global-scope vector (or map, see below) of polymorphic RNGs and a global-scope id, as the function signature TestU01 accepts does not allow this information (which RNG I want it to call from some list and which parallel stream I want the next number from). The current solution thus holds a global-scope std::map of polymorphic tk::RNG objects (using value semantics) into which the global-scope wrappers index into, using an std::map::find based on a C-style enum (integer) template argument. This works well enabling code-reuse. See RNGTest/TestU01Wrappers.h. For more information on why some of this (and other) data is global-scope, see the in-code documentation in Main/RNGTest.C.