I was wondering for quite a while how much speed-up one can roughly expect by a policy*Modern C++ Design, A. Alexandrescu, 2001 based design of algorithms in C++ using templates compared to the strategy*Design Patterns, E. Gamma, R. Helm, R. Johnson, and J. Vlissides, 1995 pattern using polymorphism. Thereby, I am especially interested in algorithms for image processing.

Strategy vs. Policy in C++

Both, policy and strategy, can be used to modularize algorithms. For more details I refer to the textbooks listed below. To compare both approaches exclusively in terms of speed, I created a simple example. I consider the binary operations addition and subtraction of two images pixel-wise and in place for all image pixels i. For the polymorphism based strategy pattern I use an abstract super class. ```cpp template struct BinaryOperator { virtual t_data sBinOperation(t_data a, t_data b) = 0; }; template struct Add : BinaryOperator { t_data sBinOperation(t_data a, t_data b) { return a+b; } }; template struct Subtract : BinaryOperator { t_data sBinOperation(t_data a, t_data b) { return a-b; } };

Further, I have a class that iterates over the image and applies the binary operation pixel-wise. The binary operation is contained as a polymorphic member.
```cpp
template <class t_data>
struct ImBinaryOperatorStrategy
{
    BinaryOperator<t_data> *binOp;
    ImBinaryOperatorStrategy(BinaryOperator<t_data> *binOp):binOp(binOp){}
    void imBinOperation(t_data* im1, t_data* im2, int width, int height, int step1,
                        int step2)
    {
        for (int y=0; y<height; y++)
            for (int x=0; x<width; x++)
                im1[x+y*step1] = binOp->sBinOperation(im1[x+y*step1],im2[x+y*step2]);
    }
};

For the policy based design with templates, I use the classes Add and Subtract (but not their base class BinaryOperator) and the following class to operate on images. In contrast to the strategy pattern, the binary operation is not a member but passed as template parameter.

template <template <class> class t_binOp, class t_data>
struct ImBinaryOperatorPolicy : t_binOp<t_data>
{
    void imBinOperation(t_data* im1, t_data* im2,int width, int height, int step1,
                        int step2)
    {
        for (int y=0; y<height; y++)
            for (int x=0; x<width; x++)
                im1[x+y*step1] =
                    t_binOp<t_data>::sBinOperation(im1[x+y*step1],im2[x+y*step2]);
    }
};

For instance, I write in case of the addition

Add<float> add;
ImBinaryOperatorStrategy<float> addOpStrategy(&add);
addOpStrategy.imBinOperation(data, dataTmp, rows, cols, step, stepTmp);

for the strategy pattern and

ImBinaryOperatorPolicy<Add, float> addOpPolicy;
addOpPolicy.imBinOperation(data, dataTmp, rows, cols, step, stepTmp);

for the template based policy pattern.

Speed Comparisons

Using OpenCV 2.4.9, I loaded the gray-scale 512x512 test image and added and subtracted the image from itself 10000 times with the policy based design and 10000 times with the strategy pattern. On my quite old Intel Core 2 Duo processor the policy based design was about 1.24 times faster than the strategy based design with no gcc optimization. However, if we switch on compiler optimization in terms of the CMake option -DCMAKE_BUILD_TYPE=RELEASE the speed-up is increased to a factor of ca. 4.13. The CMake option uses the gcc flags -O3 -DNDEBUG. If I replace addition and subtraction with max and min, the speed-up decreases to ca. 1.76. If I use the Christoph-Operators where I replace $a+b$ with $\sqrt{a+b}\sqrt{a+b}$ and analogously for minus, the speed-up is 1.02. On Christoph's i7, policy is slower than strategy for these operators and the speed-up factor is 0.78. All results including the time measurements of a third machine with an Intel Xeon CPU are summarized in the following table.

method	machine	opt.	speed-up	policy time [s]	strategy time [s]
add/sub	Core 2 Duo	no	1.24	51.67	64.05
add/sub	Core 2 Duo	yes	4.13	5.14	21.27
max/min	Core 2 Duo	yes	1.76	12.03	21.17
Christoph’s add/sub	Core 2 Duo	yes	1.02	44.29	45.23
add/sub	i7	yes	3.69	3.41	12.59
max/min	i7	yes	2.31	5.44	12.58
Christoph’s add/sub	i7	yes	0.78	41.01	32.1
add/sub	Xeon	yes	4.04	2.64	10.66
max/min	Xeon	yes	1.55	17.81	27.69
Christoph’s add/sub	Xeon	yes	1.5	18.52	27.86

The complete program is depicted subsequently.

#include <opencv2/opencv.hpp>
#include <string>

using namespace cv;
using namespace std;
// This base class is needed for strategy
template <class t_data>
struct BinaryOperator
{
    virtual t_data sBinOperation(t_data a, t_data b) = 0;
};
// That Add inherits from BinaryOperator is only necessary for strategy not for 
// policy
template<class t_data>
struct Add : BinaryOperator<t_data>
{
    t_data sBinOperation(t_data a, t_data b)
    {
        return a+b;
    }
};
// That Subtract inherits from BinaryOperator is only necessary for strategy not for 
// policy
template<class t_data>
struct Subtract : BinaryOperator<t_data>
{
    t_data sBinOperation(t_data a, t_data b)
    {
        return a-b;
    }
};
// That Maximum inherits from BinaryOperator is only necessary for strategy not for 
// policy
template<class t_data>
struct Maximum : BinaryOperator<t_data>
{
    t_data sBinOperation(t_data a, t_data b)
    {
        return max(a,b);
    }
};
// That Minimum inherits from BinaryOperator is only necessary for strategy not for 
// policy
template<class t_data>
struct Minimum : BinaryOperator<t_data>
{
    t_data sBinOperation(t_data a, t_data b)
    {
        return min(a,b);
    }
};
// That ChristophOperator inherits from BinaryOperator is only necessary for strategy 
// not for policy
template<class t_data>
struct ChristophAdd : BinaryOperator<t_data>
{
    t_data sBinOperation(t_data a, t_data b)
    {
        return sqrt(a+b)*sqrt(a+b);
    }
};
// That ChristophSubtract inherits from BinaryOperator is only necessary for strategy 
// not for policy
template<class t_data>
struct ChristophSubtract : BinaryOperator<t_data>
{
    t_data sBinOperation(t_data a, t_data b)
    {
        return sqrt(a-b)*sqrt(a-b);
    }
};
// Implements binary operations on images using policy based design
template <template <class> class t_binOp, class t_data>
struct ImBinaryOperatorPolicy : t_binOp<t_data>
{
    inline void imBinOperation(t_data* im1, t_data* im2,int width, int height, 
                               int step1, int step2)
    {
        for (int y=0; y<height; y++)
            for (int x=0; x<width; x++)
                im1[x+y*step1] = t_binOp<t_data>::sBinOperation(im1[x+y*step1],im2[x+y*step2]);
    }
};
// Implements binary operations on images using stragey based design
template <class t_data>
struct ImBinaryOperatorStrategy
{
    BinaryOperator<t_data> *binOp;
    ImBinaryOperatorStrategy(BinaryOperator<t_data> *binOp):binOp(binOp){}
    void imBinOperation(t_data* im1, t_data* im2,int width, int height, int step1, 
                        int step2)
    {
        for (int y=0; y<height; y++)
            for (int x=0; x<width; x++)
                im1[x+y*step1] = binOp->sBinOperation(im1[x+y*step1],im2[x+y*step2]);
    }
};

int main()
{
    /*
       Loading image and converting to float using OpenCV
    */
    string inFileName("im.png");
    Mat im;
    im = imread( inFileName, 0 );
    Mat im_f(im.rows, im.cols, CV_32F);
    Mat imTmp_f(im.rows, im.cols, CV_32F);
    im.convertTo(im_f,CV_32F);
    im_f.copyTo(imTmp_f);
    int rows = im_f.rows;
    int cols = im_f.cols;
    int step = im_f.step1();
    int stepTmp = imTmp_f.step1();
    float *data = reinterpret_cast<float*> (im_f.data);
    float *dataTmp = reinterpret_cast<float*> (imTmp_f.data);
    int numExperiments = 10000;

    /*
       Policy addition subtraction
    */
    ImBinaryOperatorPolicy<Add, float> addOpPolicy;
    ImBinaryOperatorPolicy<Subtract, float> subtractOpPolicy;
    double t1 = (double)getTickCount();
    for (int i=0; i<numExperiments; i++)
    {
        addOpPolicy.imBinOperation(data, dataTmp, rows, cols, step, stepTmp);
        subtractOpPolicy.imBinOperation(data, dataTmp, rows, cols, step, stepTmp);
    }
    t1 = ((double)getTickCount() - t1)/getTickFrequency();
    cout << "Time passed in seconds using policy: " << t1 << endl;
    /*
       Strategy addition subtraction
    */
    Add<float> add;
    Subtract<float> sub;
    ImBinaryOperatorStrategy<float> addOpStrategy(&add);
    ImBinaryOperatorStrategy<float> subtractOpStrategy(&sub);

    double t2 = (double)getTickCount();
    for (int i=0; i<numExperiments; i++)
    {
        addOpStrategy.imBinOperation(data, dataTmp, rows, cols, step, stepTmp);
        subtractOpStrategy.imBinOperation(data, dataTmp, rows, cols, step, stepTmp);
    }

    t2 = ((double)getTickCount() - t2)/getTickFrequency();
    cout << "Time passed in seconds using strategy: " << t2 << endl;

    cout << "Policy is about " << t2/t1 << 
        " times faster than strategy for addition and subtraction." << endl;
    /*
       Policy max/min
    */
    ImBinaryOperatorPolicy<Maximum, float> maxOpPolicy;
    ImBinaryOperatorPolicy<Minimum, float> minOpPolicy;
    t1 = (double)getTickCount();
    for (int i=0; i<numExperiments; i++)
    {
        maxOpPolicy.imBinOperation(data, dataTmp, rows, cols, step, stepTmp);
        minOpPolicy.imBinOperation(data, dataTmp, rows, cols, step, stepTmp);        
    }
    t1 = ((double)getTickCount() - t1)/getTickFrequency();
    cout << "Time passed in seconds using policy: " << t1 << endl;
    /*
       Strategy max/min
    */
    Maximum<float> max;
    Maximum<float> min;
    ImBinaryOperatorStrategy<float> maxOpStrategy(&max);
    ImBinaryOperatorStrategy<float> minOpStrategy(&min);
    t2 = (double)getTickCount();
    for (int i=0; i<numExperiments; i++)
    {
        maxOpStrategy.imBinOperation(data, dataTmp, rows, cols, step, stepTmp);
        minOpStrategy.imBinOperation(data, dataTmp, rows, cols, step, stepTmp);
    }

    t2 = ((double)getTickCount() - t2)/getTickFrequency();
    cout << "Time passed in seconds using strategy: " << t2 << endl;

    cout << "Policy is about " << t2/t1 << 
        " times faster than strategy for the pixel-wise max/min." << endl;
    /*
       Policy ChristophOp
    */
    ImBinaryOperatorPolicy<ChristophAdd, float> chaOpPolicy;
    ImBinaryOperatorPolicy<ChristophSubtract, float> chsOpPolicy;

    t1 = (double)getTickCount();
    for (int i=0; i<numExperiments; i++)
    {
        chaOpPolicy.imBinOperation(data, dataTmp, rows, cols, step, stepTmp);
        chsOpPolicy.imBinOperation(data, dataTmp, rows, cols, step, stepTmp);
    }
    t1 = ((double)getTickCount() - t1)/getTickFrequency();
    cout << "Time passed in seconds using policy: " << t1 << endl;
    /*
       Strategy ChristophOp
    */
    ChristophAdd<float> chaOp;
    ChristophSubtract<float> chsOp;
    ImBinaryOperatorStrategy<float> chaOpStrategy(&chaOp);
    ImBinaryOperatorStrategy<float> chsOpStrategy(&chsOp);

    t2 = (double)getTickCount();
    for (int i=0; i<numExperiments; i++)
    {
        chaOpStrategy.imBinOperation(data, dataTmp, rows, cols, step, stepTmp);
        chsOpStrategy.imBinOperation(data, dataTmp, rows, cols, step, stepTmp);
    }

    t2 = ((double)getTickCount() - t2)/getTickFrequency();
    cout << "Time passed in seconds using strategy: " << t2 << endl;

    cout << "Policy is about " << t2/t1 << 
        " times faster than strategy for the Christoph operator." << endl;
    return 0;
}

Output without compiler optimization only for addition and subtraction: Time passed in seconds using policy: 51.6706
Time passed in seconds using strategy: 64.0456
Policy is about 1.2395 times faster than strategy. Output with compiler optimization (-O3 -DNDEBUG): Time passed in seconds using policy: 5.14101
Time passed in seconds using strategy: 21.271
Policy is about 4.13752 times faster than strategy for addition and subtraction. Time passed in seconds using policy: 12.0251
Time passed in seconds using strategy: 21.1691
Policy is about 1.76041 times faster than strategy for the pixel-wise max/min. Time passed in seconds using policy: 44.2865
Time passed in seconds using strategy: 45.2317
Policy is about 1.02134 times faster than strategy for the Christoph operators.