SQL Server Lookup Transformations: Cache Modes (Part 1/2)

In Microsoft SQL Server Integration Services 2012 (SSIS) Lookup Transformations can be used to integrate external data into your Data Flow. While most Lookup Transformations in SSIS are relatively simple, it is important to know how they work and which situations to use them in. In this entry, we will focus on the cache modes of this transformation. There are three modes of caching in the Lookup Transformation: Full, Partial, and None.

Full Cache

As the name implies, the full cache mode keeps a copy of the reference data set in memory and performs all of the lookups from there. This means that the entire reference dataset must be moved to memory before the lookup process begins. Moving the reference dataset to cache allows for lookups to be performed much faster, making it the optimal method of performing lookups in large data flows with a relatively small reference dataset.

Full Cache mode also allows you to use a reference dataset from a Cache Connection Manager instead of the standard OLE DB Connection. When selected, SSIS will load a given cache file (usually a .caw file) to memory to be used for the lookup. If you find yourself in a situation where you are performing the same lookup multiple times, this method can be an effective way to share these reference datasets, especially if they are large.

Partial Cache

In this mode, the lookup transform checks the database for a value. When the value is found, a copy of it is kept locally. This mode is particularly useful if your reference table is especially large, if there will likely be a pattern of similar values being searched for, or if you need to take advantage of the advanced functionality of the Lookup Transformation.

Partial Cache mode is the only way you can utilize the advanced functionality for lookups. In the advanced tab you can set options relating to the size of your cache. The two options for cache size on this tab allow you to specify the values of the cache in 32 bit and 64 bit environments. These attributes can be modified at runtime by changing the “MaxMemoryUsage” and “MaxMemoryUsage64” attributes in the transformation’s properties

The Partial Caching section of the advanced tab can also be used to for caching lookups that did not return a result. To do this, select “Enable cache for items with no matching entries.” You will then need to specify a percentage of cache that will be dedicated to holding “no match” entries. This is particularly useful when you expect for your lookup to not provide a result on values that you know will be repeated.

No Cache

In this mode, no cache is kept from each lookup operation. The SSIS runtime environment will perform a database lookup for every value that passes through the transformation and no record will be kept in memory. This mode is only useful if you have a volatile reference dataset or if you are absolutely certain that none of the rows of a very large reference table will be repeatedly referenced.

Continue reading part 2 of 2.