ForEach Loop

 The Foreach Loop container defines a repeating control flow in a package. The loop implementation is similar to Foreach looping structure in programming languages. In a package, looping is enabled by using a Foreach enumerator. The Foreach Loop container repeats the control flow for each member of a specified enumerator.

SQL Server Integration Services provides the following enumerator types:

·        Foreach ADO enumerator to enumerate rows in tables. For example, you can get the rows in an ADO recordset.

The Recordset destination saves data in memory in a recordset that is stored in a package variable of Object data type. You typically use a Foreach Loop container with the Foreach ADO enumerator to process one row of the recordset at a time. The variable specified for the Foreach ADO enumerator must be of Object data type. For more information about the Recordset destination.

·        Foreach ADO.NET Schema Rowset enumerator to enumerate the schema information about a data source. For example, you can enumerate and get a list of the tables in the AdventureWorks2012 SQL Server database.

·        Foreach File enumerator to enumerate files in a folder. The enumerator can traverse subfolders. For example, you can read all the files that have the *.log file name extension in the Windows folder and its subfolders. Note that the order in which the files are retrieved cannot be specified.

·        Foreach From Variable enumerator to enumerate the enumerable object that a specified variable contains. The enumerable object can be an array, an ADO.NET DataTable, an Integration Services enumerator, and so on. For example, you can enumerate the values of an array that contains the name of servers.

·        Foreach Item enumerator to enumerate items that are collections. For example, you can enumerate the names of executables and working directories that an Execute Process task uses.

·        Foreach Nodelist enumerator to enumerate the result set of an XML Path Language (XPath) expression. For example, this expression enumerates and gets a list of all the authors in the classical period: /authors/author[@period='classical'].

·        Foreach SMO enumerator to enumerate SQL Server Management Objects (SMO) objects. For example, you can enumerate and get a list of the views in a SQL Server database.

·        Foreach HDFS File Enumerator to enumerate HDFS files in the specified HDFS location.

·        Foreach Azure Blob enumerator to enumerate blobs in a blob container in Azure Storage.

·        Foreach ADLS File enumerator to enumerate files in a directory in Azure Data Lake Store.

·        Foreach Data Lake Storage Gen2 File enumerator to enumerate files in a directory in Azure Data Lake Store Gen2.

The following diagram shows a Foreach Loop container that has a File System task. The Foreach loop uses the Foreach File enumerator, and the File System task is configured to copy a file. If the folder that the enumerator specifies contains four files, the loop repeats four times and copies four files.



Setting up the ForEach Loop Container

Our first step as all components in SSIS is to drag the Foreach Loop container from the Toolbox to the control flow design surface, as shown below.



This is the container once it is dragged.


You now have to configure your ForEach Loop. The different types of config has been discussed above.
I will be going through two types of configs, File Enumerator Config and the ADO Enumerator Config, these are the two most common used configs.

File Enumerator Config

In this config we set up a path and and file mask of a file(s) that we would like to loop through.



Under the "Folder" option, is where the file path is passed. In the "Files" option is where the file mask is used.
In the Files option, if the default is left "*.*", the ForEach loop will then go and pick up and try and process all the files in that path. Therefore it is very important to specify the "File Mask".

The "Retrieve File name" option is used to tell the ForEach Loop, how to identify the file.

Name and extension:- The ForEach Loop will look only at the name and the extension of the file. The extension is based on what we specify in the "Files" option.
 
Fully qualified:- With this option, the ForEach loop, will then concatenate the "Folder" + name of the file it finds + "Files" options, to identify the location, name and extension of the file.

Name Only:- Here the ForEach loop will only look at the name of the file.

My personal experience and option is that, I always use the "Fully qualified" option, as it works best for me, going forward we will be creating a few variables that would be translated into expressions with the forEach loop, which would justify my opinion.
 
Then we have a "Transverse subfolders" option, this option is used, when we want loop through multiple folders from the folder we specified in the "Folders" option to the end of a list of folders.
We do not specify the list, the loop will go through all possible folders which are available. This is used to avoid using multiple ForEach Loops for a single run for the same file type going to the same destination.

I will now create a variable and a parameter to be used within our ForEach loop config.
For more info on variables and parameters and how they work refer to : Variables and Parameters

I start off creating a parameter, why I create this as a parameter, is that it will be used by the environment variable. I am going to post an article later about environment variables and how to use them, but, the long and short of it is that, the environment variables are, variables, that are setup in the server side.




These variables have defaulted values and names, these variables are set up by the different environments, such as, Dev, Test and Prod, each environment has a different value, therefore, on deployment of the SSIS package, the environment variable name is used and based on the environment we deploying to, the value of that variable will be used.

That is some context of the environment variable and to get more understanding as to why I have created mine as I did below.

Based on my parameter I have created, I then create my variable to be used within my package.

 
 

You will notice that I have an expression within my variable. This is used by concatenating the parameter with the second part of the file path.


Once these have been created I then go ahead, with completing my ForEach loop config.

On the ForEach Loop Editor, next to expression box, click the "..." button to open dialog box as seen below.


On the left is the "Property", this indicates to SSIS what to assign the variable to, we go and select the "Property" first.


For this example we select "Directory", this will indicate to SSIS, to use the path in the variable.
Once the "Property" has been selected, we then go ahead and set the "Expression", which is on the right side of the dialog box as seen above. Once again we select the "..." button, to open another dialog box.


Take note that, I have selected "User:FilesFolder" variable, this is the concatenation, of the parameter with the end of the directory. Important to remember, keep all directory paths the same or similar in each environment, this will make using environment variables easier to use, also to remember, in each environment create a folder that identifies the different environment.

eg: C:\SourceFolder\Dev\
      C:\SourceFolder\Test\
      C:\SourceFolder\Prod\

Once we have selected the directory, we going to set the "File Mask", this is used to either identify the file extension or a "common" word in the file name.

eg. *file* -- This file mask with tell SSIS, to use a file with a filename that contains the word "file" in it.

Same as setting the directory in the "Property Expression Editor", we need to select a property and set and expression.

For the file mask, on the property we select the "Filespec" option.


I have created a variable with the file mask.

 

Notice that I have a default value within this variable, the value should not change, unless requirements have changed, in an environment, that we are aware that, requirements, like file types change regularly, instead of creating a variable, we then would set a parameter, this would make it easier to change after the package has been deployed, it is not recommended, to redeploy SSIS packages all the time, we rather cater for cases where we could make changes to a SQL job, of if the package has been deployed to the SSIS catalog, change it then at that level.

SQL jobs and SSIS catalog articles will be posted soon, with more details on how they work and what they are used for.

The final config on the ForEach loop container is, setting the variable mapping. I call this the "magic" variable, why I call this the "magic" variable, is that there is not value set in the variable. The SSIS package uses the previous configs, within the container, to get the value for this variable.
In this case, it will be populated by the "Filename" of the file. 


Once this, variable is created we would then need to use it in the ForEach loop container.

On the left of the ForEach loop container Editor, select, "Variable Mapping", once select you will notice the is again two columns, Variable and Index.
Variable is where we select the variable to be used, "Index" is the right most column, this indicates to SSIS the sequence in which the variables should be used, in this context we only using one variable.
You should notice that the "Index" starts at "0", in programming everything starts at "0" instead of "1", do not change the value to "1", or your package will fail.


We have now concluded in the configuration of the ForEach loop container, to follow is how do we now, use this container.

Comments