VPI and DPI

Although SystemVerilog offers a set of powerful language primitives to satisfy normal usages, in many cases people wish to opt for more direct control over the simulation, retrospection on the design hierarchy at runtime, or even have finer granularity of assertion. As a result, in 2005 IEEE standardize the interface between the simulator and the C programming interface and name it Verilog Procedural Interface (VPI), originally known as PLI 2.0. VPI allows C functions to be invoked in behavioral RTL code and provides a set of events to which C code can register callback. Notice that VPI is part of Verilog standard and by virtually inheriting Verilog, SystemVerilog also supports VPI.

Although a powerful tool to use, VPI has couple limitations and SystemVerilog introduces Direct Programming INterface (DPI), which allows foreign languages such as C and C++ to directly interface with the simulator. Unlike VPI, DPI ensures compatibility and efficiency by providing an two separated layers (SystemVerilog layer and foreign language layer). We will discuss the similarity and differences in this chapter.

9.1 Verilog Procedural Interface

VPI offers a set of C functions that can be used to interact with the simulator:

Introspect the entire design hierarchy
Set callbacks to a set of simulation events such as when the simulation starts or when a signal value changes
Wrap a C function as used-defined system task called by test bench code.

In this chapter we will mainly focus on the first two features of VPI, since the last feature can be done in DPI most of the time.

9.1.1 Design Introspection

In VPI, every object is referred as a handle, which has the data type of vpiHandle defined in the header file. This is essentially a pointer to an object whose implementation is vendor-dependent. Therefore, we cannot make any assumptions about how the handle is created or managed except following the LRM rules, if we want our code to be portable to various simulators.

The data model for VPI is centered around the actual design hierarchy, originated from the very top module. As a result, there are essentially two kinds of relationship for VPI handles:

One-to-one. For instance, the module instance handle has one-to-one relationship to its module definition handle.
One-to-many. This is more common, for instance, a module instance typically has many port instances.

Depends on the relationship, the API call to query related objects is different. Although in most cases the relationship is intuitive, we highly recommend to check the LRM for correctness. Figure 13 shows an example of the data model for the instance and how you can traverse the object handles with different relationships and obtain other information.

Figure 13: Instance model diagram. Copied from LRM 37.10

The diagram packs a dense information and we need to read it in several aspects. First, we can take a look at how to obtain object handle information. There are three VPI functions that allows us introspect properties of an object handle, namely vpi_get, vpi_get64, and vpi_get_str. Their function definitions are shown below:

XXTERN PLI_INT32  vpi_get             PROTO_PARAMS((PLI_INT32 property,
                                                    vpiHandle object));
XXTERN PLI_INT64  vpi_get64           PROTO_PARAMS((PLI_INT32 property,
                                                    vpiHandle object));
XXTERN PLI_BYTE8 *vpi_get_str         PROTO_PARAMS((PLI_INT32 property,
                                                    vpiHandle object));

PLI_INT32 is a typedef defined in the header file, which implies that value has to be predefined as well. As a result, you should call vpi_get with a predefined property and you will result back with a predefined value as well. vpi_get_str is used to get string properties from the object handle, such as names. Figure 14 below shows a more detailed example of how they can be used with respect to the relation diagram.

Figure 14: Diagram key for accessing properties. Copied from LRM 37.4.2

Now let’s look at how to traverse the hierarchy based on different relations. If it is an one-to-one relationship, use vpi_handle, and if it’s one-to-many relationship, use vpi_iterate followed by vpi_scan. Figure 15 below shows how one should interpret the relationship arrows in the diagram. Notice that in the figure, the LRM distinguish between object and tag. This is because the VPI has a set of predefined object types and tags are not object types.

Figure 15: Diagram key for traversing relationships. Copied from LRM 37.4.3

Now let’s take a look at the instance diagram again now that we have covered the properties and relationship. The bottom left section describes properties associated with the instance object and their corresponding types, categorized by their functionalities. For instance, suppose we have a handle vh for a module instance, we can use vpi_get_str(vpiName, vh) to get the instance name, or vpi_get_str(vpiFullName, vh) for its full name.

To query its internal variable, we can use vpiReg as shown below:

vpiHandle iter, obj;
iter = vpi_iterate(vpiReg, vh);
while (obj = vpi_scan(iter)) {
    // do something with obj
}

Notice the backward arrow from the right to left, it means that we can query the parent with one-to-one relation using vpiInstance and vpiModule respectively.

Below is a more detailed example how we can print out the entire design hierarchy using VPI with breadth-first search. Notice that we’re using C++ instead of C since it is much easier to write high-level code with C++. Notice that by convention, if we query vpiModule with an nullptr, we will get the top module instance.

std::queue<vpiHandle> handle_queues;
handle_queues.emplace(nullptr);

while (!handle_queues.empty()) {
    auto *mod_handle = handle_queues.front();
    handle_queues.pop();
    auto *handle_iter = vpi_->vpi_iterate(vpiModule, mod_handle);
    // if the instance doesn't have any child instances, this will be nullptr
    if (!handle_iter) continue;
    vpiHandle child_handle;

    while ((child_handle = vpi_->vpi_scan(handle_iter)) != nullptr) {
        // get the definition name
        const auto *def_name = vpi_->vpi_get_str(vpiDefName, child_handle);
        // get the child instance handle's name
        auto const *hierarchy_name = vpi_->vpi_get_str(vpiFullName, child_handle);
        // print out the information
        printf("Instance name: %s (%s)\n", hierarchy_name, def_name);
        
        handle_queues.emplace(child_handle);
    }
}

VPI also provides a simple way to query object if we know its full/partial hierarchical name. To do so, simply do:

auto const *handle_full = vpi_handle(full_name, nullptr);
auto const *handle_scoped = vpi_handle(full_name, scope_handle);

By default, it the scope_handle is nullptr, we need to use the full hierarchy name. If the object is not found given the name and scope, nullptr will be returned.

To read out a signal values with the given handle as an integer, we can simply use vpi_get_value as follows:

s_vpi_value v;
v.format = vpiIntVal;
vpi_get_value(handle, &v);
int64_t result = v.value.integer;

Notice that s_vpi_value.value is a union and we need to access it based on how we specify the format when calling the function.

Notice that for many simulators, reading hierarchy or reading signal values require to turn on read access during compilation. This is typically done via providing -r switch in the command line arguments. Setting values, however, requires additional write permission, which is typically done via -w flag.

9.1.2 Callbacks

Callbacks are the most common way to interact with the simulator and inject custom logic to various places. As shown in Figure 12, there are various places we can insert callbacks, called PLI region for legacy reasons.

To add a call back, simply define the following struct:

static s_vpi_time time{vpiSimTime};
static s_vpi_value value{vpiIntVal};
// handle to place callback on
vpiHandle obj;
// callback types (called reason)
int reason;
// call back function
int (*cb_rtn)(p_cb_data);

s_cb_data cb_data{.reason = reason,
                  .cb_rtn = cb_rtn,
                  .obj = obj,
                  .time = &time,
                  .value = &value,
                  .user_data = user_data;

Notice that there are several fields we need to fill in. reason is redefined in vpi_user.h, such as cbValueChange and cbStartOfSimulation. Readers are encourage to check out vpi_user.h.

cb_rtn is the raw function pointer that takes a pointer to the callback object (s_sb_data) and returns an integer (typically 0). The callback can be used to identify the triggering object, allowing sharing the same callback function among different objects.

Some callbacks require a handle for triggering, such as cbValueChange, which triggers the callback when the value of the corresponding handle changes. These callbacks are typically related to simulation event and obj field are required to be filled in. In addition to obj, time.time and value.format should also be filled out, as shown in the example. This is because the simulator will fill out the value in the callback struct when invoking the callback function. Other types of callback should leave these fields as nullptr. Readers should check out Section 38.36.1 for more details.

user_data is a char* pointer that points to arbitrary memory location. Users are responsible to manage the life cycle and ensure that it is valid throughout the simulations.

Callbacks can be registered by calling vpi_register_cb, which returns a handle that can be used to un-register the callback. Once the callback is removed via vpi_remove_cb, the handle is no longer valid.

Below shows two examples of adding callbacks using vpi_register_cb

// callback function to print out signal's name and value
PLI_INT32 on_value_change_callback(p_cb_data cb_data) {
    // get current value
    auto value = cb_data->value->value.integer;
    // get signal's full name
    auto const *handle = cb_data->obj;
    auto const *name = vpi_get_str(vpiFullName, handle);
    printf("signal name: %s value: %lld\n", name, value);

    return 0;
}

// helper function to add callback to any given handle
vpiHandle add_on_value_change_callback(vpiHandle handle) {
    s_vpi_time time{vpiSimTime};
    s_vpi_value value{vpiIntVal};
    s_cb_data cb_data{.reason = cbValueChange,
                      .cb_rtn = &on_value_change_callback,
                      .obj = handle,
                      .time = &time,
                      .value = &value,
                      .user_data = nullptr;
    auto *r = vpi_register_cb(&cb_data);
    return r;
}

// callback function to add monitors on clock
PLI_INT32 on_sim_start_callback(p_cb_data cb_data) {
    // get clock signal. assume it's called
    // "clk" at the top
    printf("Simulation started"\n);

    auto *handle = vpi_handle_by_name("clk");
    // call our helper function to add callbacks
    // for simplicity we don't track the return handle
    // but in practice we need to keep it somewhere,
    // and remove it when not in use anymore
    add_on_value_change_callback(handle);

    return 0;
}


// helper function to add callback when simulation starts
vpiHandle add_on_sim_start_callback() {
    s_cb_data cb_data{.reason = cbStartOfSimulation,
                      .cb_rtn = &on_sim_start_callback,
                      .obj = nullptr,
                      .time = nullptr,
                      .value = nullptr,
                      .user_data = nullptr;
    auto *r = vpi_register_cb(&cb_data);
    return r;
}

In the example, we add a callback to the start of simulation (cbStartOfSimulation), which adds another callback that monitor the value the clock (cbValueChange). The monitor callback is generic and can print out any signal’s name and value, using the function argument.

Now that we have covered how to create callbacks, we need to know how to have the simulator actually call our helper functions. In VPI, it is done via a specific function table called vlog_startup_routines. It is essentially a null-terminated function pointer array, and must be defined with the exact name. If the implementation is done in C++, which is very likely, the array has to be wrapped inside the extern "C" block to avoid C++’s name mangling. The code below shows how to define the callback using the function we just covered.

extern "C" {
void (*vlog_startup_routines[])() = {add_on_sim_start_callback, nullptr};
}

In addition to that, there are very limited VPI functionality allowed in any function presented inside vlog_startup_routines. For our purpose, the only VPI function allowed to call is vpi_register_cb, and the only reasons we can use are the following:

cbEndOfCompile
cbStartOfSimulation
cbEndOfSimulation
cbUnresolvedSystf
cbError
cbPLIError

Users are required to put any other VPI functions in callbacks using cbStartOfSimulation, as we have done in the example.

9.1.3 Compile and Usage

To compile any VPI code, we can use any build tools and compiler, as long as we build a shared library with position-independent code flag on(-fPIC). Once we have the shared library, say libvpi.so, we need to provide it to the simulator. Different vendors have different set of flags to do so, and we will only cover usage for VCS and Xcelium.

VCS: -load libvpi.so. Typically you also need -debug_acc+all to allow VPI functions to introspect or change the design.
Xcelium: -loadvpi libsvi.so:add_on_sim_start_callback. Xcelium does not rely on vlog_startup_routines, so we need to provide the function name in the command line. -access +rw are recommend if you need to introspect or change the design.

9.2 Direct Programming Interface

Direct Programming Interface (DPI) is introduced in SystemVerilog to help bring the powerful C/C++ development environment to the simulation. It allows SystemVerilog code calls arbitrary C functions and invoke SystemVerilog functions from C, with some caveats of course. In addition, DPI ensures ABI compatibility and thus libraries compiled for one platform should work on any system under that particular platform. Since DPI is much easier to understand and program than VPI-based system tasks, we will focus it here.

9.2.1 Data types in DPI

Since DPI functions as a translation layer between standard C/C++ libraries and vendor-specific simulators, DPI specifies a type translation between native types in C/C++ and SystemVerilog. Table 6 shows the data type mapping.

Table 6: SystemVerilog and C data type conversion
SystemVerilog Type	C Type
`byte`	`char`
`shortint`	`short int`
`int`	`int`
`longint`	`long` `long`
`real`	`double`
`shortreal`	`float`
`chandle`	`void *`
`string`	`const char *`
`bit`	`unsigned char`
`logic`/`reg`	`unsigned char`

There are several things we need to pay extra attention. Notice that most of the SystemVerilog types are 2-state values. If 4-state value is provided, e.g. logic[3:0], simulator typically does type coercions to the type specified by the DPI function signature. For single bit value such as bit, by C convention is it treated as binary value in char.

SystemVerilog also allows open array as input type, which requires extra attention since it converts to custom C struct not defined by standard C/C++. C/C++ libraries that support open array has to be linked with the simulator. A mock array implementation is required if users want to link the library without a simulator. We will see examples of how to interact with open array later.

9.2.2 Context and Pure Functions

For performance reasons, SystemVerilog defines two additional keyword modifier to define DPI functions, i.e. context and pure. Keyword context is recommended if the function requires access to the DPI context, e.g. calling svSetScope. Prior to the DPI call marked with context, the simulator needs to set up the caller context scope properly, significantly slow down the simulation. As a result, we will not cover it in details here since most of the functionality can be achieved via C++ object-oriented programming exposed as DPI.

Keyword pure indicates that the function does not have any side-effects. Therefore, the compiler is free to rearrange the function call as it sees fit, which will slightly improve the simulation performance. Side-effects includes

Modifying global state, e.g. static values
Interacting with filesystem or networks
Modifying shared objects using multi-threading

If a function doesn’t have pure modifier, the execution order would be the same as normal SystemVerilog functions. This is the most command case.

9.2.3 Import functions from C/C++

To call a function from C/C++, we need to finish the following two tasks:

Define the function prototype in SystemVerilog
Make sure to export the function in C/C++ by using extern C (if C++) to avoid name mangling.

The syntax for DPI function is very similar to that of normal functions in SystemVerilog: we need to define the type of arguments and return type. The only difference is the keyword import followed by "DPI-C". Here are some examples for the DPI function definition.

// pure function that does a + b
import "DPI-C" pure function int add(input int a, input int b);
// sending a packet, with an open array of data
import "DPI-C" function int send_udp_packet(input string ip_address, input shortint unsigned port, input byte data[]);
// add but with output as argument
import "DPI-C" function void add_output(input int a, input int b, output int c);

Calling these functions are the same way as calling normal functions. As a result, we will focus on how to implement these functions from C/C++.

For the add function, we need the following C/C++ code. We use C++ here since it’s more commonly used and offers better object-oriented programming with chandle. If you want to use C instead, you can remove extern "C" and use a C compiler instead.

// add.cc
extern "C" {
// SystemVerilog definition
// import "DPI-C" pure function int add(input int a, input int b);
int add(int a, int b) {
    return a + b;
}
}

First we need to compile the code into a shared object, you can simply do g++ add.cc -shared -o add.so. Once we have the shared library, we can try to use it in our SystemVerilog code. Below shows an simple example of calling the add function.

// test.sv
import "DPI-C" pure function int add(input int a, input int b);

module top;
initial begin
    int a, b, c;
    a = 2;
    b = 4;
    c = add(a, b);
    $display("c is", c);
end
endmodule

We can run the example by invoking the Xcelium: xrun test.sv -sv_lib add.so, which will print out correct result.

For the second DPI function, we need to use the array structure defined in the SystemVerilog LRM. Readers can either copy the file directly from the LRM, or use the ones that shipped with the simulator. We will use the later, which is located at ${XCELIUM_HOME}/tools/include/.

Here is the code structure to read out a char array in C++ and corresponding function implementation for the DPI function.

// send_udp_packet.cc
// don't forget to include the header file since we're using array
#include "svdpi.h"
#include <vector>

std::vector<char> read_data(svOpenArrayHandle array) {
    // notice the argument type is svOpenArrayHandle
    std::vector<char>  result;
    // get loop bound
    auto low = svLeft(array, 1);
    auto high = svRight(array, 1);
    // get size and reserve the vector
    auto size = svSize(array, 1);
    result.reserve(size);

    for (auto i = low; i <= high; i++) {
        auto *value = reinterpret_cast<char*>(svGetArrElemPtr1(array, i));
        result.emplace_back(*value);
    }

    return result;
}


extern "C" {
// SYstemVerilog definition
// import "DPI-C" function int send_udp_packet(input string ip_address, input shortint unsigned port, input byte data[]);
int send_udp_packet(const char *ip_address, uint16_t unsigned port, svOpenArrayHandle data) {
    auto byte_data = read_data(data);

    // do something with the data, ip address, and port
    return 0;
}
}

Notice that svLeft, svRight, and svSize need dimension to compute the result. We use 1 here to indicate the first dimension. If it is a multi-dimension array, we can use higher numbers. By convention, lower and upper bound of the array is obtained via svLeft and svRight respectively. svGetArrElemPtr1 is used to get first dimension data. If the array is multi-dimension, svGetArrElemPtr# is used, where # corresponds to the dimension.

To compile, we need to tell the compiler to include DPI header files. Here is the example command

g++ send_udp_packet.cc -shared -o send_udp_packet.so -I${XCELIUM_HOME}/tools/include/

Below is the example of SystemVerilog that use array to call our function.

// test_send_udp_packet.sv
import "DPI-C" function int send_udp_packet(input string ip_address, input shortint unsigned port, input byte data[]);

module test_send_udp_packet;

initial begin
    byte array[];
    int res;

    array = new[4];
    for (int i = 0; i < array.size(); i++) begin
        array[i] = 42 + i;
    end

    res = send_udp_packet("127.0.0.1", 8888, array);    

end

endmodule

Notice that we pass in an open array which can be dynamically-sized. If we have a fixed size array, we can directly pass in the array variable thanks to automatic type conversion.

If the function has output arguments, we need to use pointers as the C/C++ function argument:

// SystemVerilog DPI definition
// import "DPI-C" function void add_output(input int a, input int b, output int c);
extern "C" {
void add_output(int a, int b, int *c) {
    *c = a + b;
}
}

We can simply call the function as usual, as shown below, which will print out c is 42.

import "DPI-C" function void add_output(input int a, input int b, output int c);

module top;

initial begin
    int a, b, c;
    a = 40;
    b = 2;
    add_output(a, b, c);
    $display("c is %0d\n", c);
end

endmodule

9.2.4 Output SystemVerilog Functions to C/C++

Although uncommonly used, SystemVerilog allows users to call SystemVerilog functions inside C/C++ as well. Most of the restrictions and rules for imported functions apply for exported functions as well. Below shows an example how to call a function defined in SystemVerilog.

extern "C" {
int add(int a, int b);

int get_value() {
    int a = 40, b = 2, c;
    c = add(a, b);
    return c;
}
}

export "DPI-C" function add;
import "DPI-C" context function int get_value();

function int add(input int a, input int b);
    return a + b;
endfunction

module top;

initial begin
    int c = get_value();
end

endmodule

Notice that exported has several differences from imported function:

The declaration syntax for exported function does not contain function prototype. It only needs a function identifier/name.
All exported functions are context function. As a result, if any imported function needs to call them, the imported function needs to be marked as context.

Because of the restriction of context function and the poor performance associated with the context functions, exported functions are rarely used unless necessary.

9.2.5 Object-Oriented Programming with DPI

With DPI, we can port most of the C++ code into SystemVerilog while maintaining the object-oriented interface (with some caveats which we will discuss later). The key is to use raw pointer type chandle, which holds our object pointer. We also need to create C bindings that convert C++ object interface to C interface, similar how old-fashion Python-binding is done. Below shows an example of how to port C++ object codes to SystemVerilog.

class Dog {
public:
    Dog(): distance_(0) {}
    void run(int distance) {
        distance_ += distance;
    }

    int distance() const { return distance_; }

private:
    int distance_;
};

// export function to C
extern "C" {
void *dog_ctor() {
    return new Dog();
}

void dog_dctor(void *dog) {
    auto *ptr = reinterpret_cast<Dog*>(dog);
    delete ptr;
}

void dog_run(void *dog, int distance) {
    auto *ptr = reinterpret_cast<Dog*>(dog);
    ptr->run(distance);
}

int dog_distance(void *dog) {
    auto *ptr = reinterpret_cast<Dog*>(dog);
    return ptr->distance();
}
}

Here is the DPI declaration and SystemVerilog Bindings

package dog;
import "DPI-C" function chandle dog_ctor();
import "DPI-C" function void dog_dctor(chandle dog); 
import "DPI-C" function void dog_run(chandle dog, int distance);
import "DPI-C" function int dog_distance(chandle dog);


class Dog;
    local static chandle handles[$];
    local chandle handle;

    function new();
        handle = dog_ctor();
        handles.push_back(handle);
    endfunction

    function void run(int distance);
        dog_run(handle, distance);
    endfunction


    function int distance();
        return dog_distance(handle);
    endfunction

    static function final_();
        foreach(handles[i]) begin
            dog_dctor(handles[i]);
        end
    endfunction

endclass
endpackage

We can test out our binding using the following test bench code:

module top;

import dog::*;

Dog dog;

initial begin
    dog = new();
    dog.run(2);
    dog.run(40);
    $display("distance: %d", dog.distance());
end

final begin
    Dog::final_();
end

endmodule

We should see 42 printed out. Notice that although SystemVerilog is objected-oriented language like C++, it has garbage collection. The simulator needs to clean up unused objects, instead of programmers. As a result, we need to manually call clean up methods to delete any object created from C++. There are several ways to do it, and the example above shows an approach that only clean up at the end of simulation. In this case we hold every objects created in SystemVerilog into an queue. Once we’re done with the simulation, we call the destructor to clean up the memory.

9.2.6 Performance Tips for DPI

Although DPI is very useful in RTL design and verification, it has some performance impact on simulation. Here are some tips on how to speed up simulation that heavily relies on DPI function calls:

Avoid using context DPI as much as possible. If a user context is required, try to pass in a raw chandle to obtain the context.
Batch up calls using an array. If the DPI is streaming in data to the backend, try to use batch the data up using arrays.
If the DPI function is complex but not required to control execution flow, try to implement it in a different thread. The main DPI threads returns immediately after dispatching the task. Notice that this requires proper concurrent programming techniques, but still worth the effort if the DPI performance is not expected.