Shared Resources Deployment

This section provides an overview of deploying the shared resources for an Ensono Stacks Data Platform.

The shared resources comprise Azure Data Factory components that are used across pipelines. These are as follows:

  • Linked services
    • ls_ADLS_DataLake - Connection to the Data Lake
    • ls_Blob_ConfigStore - Connection to the config storage location
    • ls_Databricks_Small - Connection to a Databricks job cluster (default 2 fixed workers; see the sketch after this list)
    • ls_KeyVault - Connection to Azure Key Vault
  • Datasets
    • ds_dp_ConfigStore_Json - For reading JSON data from ls_Blob_ConfigStore
    • ds_dp_DataLake_Parquet - For writing Parquet data to ls_ADLS_DataLake
  • Pipelines
    • pipeline_Get_Ingest_Config - To retrieve config data for use in a pipeline
    • pipeline_Generate_Ingest_Query - To generate a query for ingesting data
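
For a sense of what these resources look like in source control, the sketch below outlines the ls_Databricks_Small linked service, rendered as YAML for readability (Data Factory stores these definitions as JSON). The node type, runtime version and workspace URL shown are illustrative assumptions, not the actual values used by Ensono Stacks.

name: ls_Databricks_Small
properties:
  type: AzureDatabricks
  typeProperties:
    domain: https://<databricks-workspace-url>   # workspace URL, resolved at deployment (placeholder)
    newClusterNodeType: Standard_DS3_v2          # illustrative node type (assumption)
    newClusterNumOfWorker: "2"                   # the "2 fixed workers" default
    newClusterVersion: <runtime-version>         # illustrative placeholder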

For details of how these resources are used in ingest pipelines, see data ingestion.

This guide assumes the core Ensono Stacks data platform has already been deployed, and that you have the project repository available locally.

Step 1: Create a feature branch

Open the project locally and create a new feature branch, e.g.:

git checkout -b feat/de-shared-pipeline

The de_build folder includes a YAML file called job-pipeline-vars, which contains the variables used in the DE shared resources pipeline. These variables must be updated to reflect your project's requirements; a sketch of such a file follows.
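
As a rough illustration only, a variables file of this kind might look like the following. The variable names and values here are assumptions made for the sketch, not the actual contents of job-pipeline-vars.

variables:
  - name: region
    value: westeurope        # Azure region the platform is deployed to (assumption)
  - name: company
    value: mycompany         # used when composing Azure resource names (assumption)
  - name: project
    value: stacks            # project identifier (assumption)
  - name: domain
    value: shared            # workload name for the shared resources (assumption)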

Step 2: Add a shared resources pipeline in Azure DevOps

The default shared resources for the Ensono Stacks Data Platform are found under de_workloads/shared_resources. This directory contains a YAML file, de-shared-resources.yml, which defines a template Azure DevOps CI/CD pipeline for building and deploying the shared resources. This YAML file should be added as the definition for a new pipeline in Azure DevOps, as described below.
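
For orientation, the overall shape of such a pipeline definition is sketched below. The stage names, variable group name and steps are illustrative assumptions, not the actual contents of de-shared-resources.yml.

trigger:
  branches:
    include:
      - main

variables:
  - group: de-pipeline-variables        # hypothetical variable group name

stages:
  - stage: Build
    jobs:
      - job: Build_Shared_Resources
        steps:
          - script: echo "Validate and package the shared resource artifacts"

  - stage: Deploy_NonProd
    dependsOn: Build
    jobs:
      - deployment: Deploy
        environment: nonprod
        strategy:
          runOnce:
            deploy:
              steps:
                - script: echo "Deploy shared resources to nonprod"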

  1. Sign in to your Azure DevOps organization and go to your project.
  2. Go to Pipelines, and then select New pipeline.
  3. Name the new pipeline, e.g. de-shared-resources.
  4. For the pipeline definition, specify the YAML file in the project repository feature branch (de-shared-resources.yml) and save.
  5. The new pipeline will require access to any Azure DevOps pipeline variable groups specified in the pipeline YAML. Under each variable group, go to 'Pipeline permissions' and add the new pipeline (see the sketch below for how variable groups are referenced).
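
For context, a pipeline references variable groups in its YAML roughly as follows; the group name here is a placeholder assumption. A pipeline can only read a variable group once that group's 'Pipeline permissions' include the pipeline.

variables:
  - group: my-variable-group   # placeholder, use the group names specified in de-shared-resources.yml
  - name: environment
    value: nonprod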

Step 3: Deploy shared resources in non-production environment

Run the pipeline configured in Step 2 to start the build and deployment process.

Running this pipeline in Azure DevOps deploys the artifacts into the non-production (nonprod) environment. Monitor the run in the Pipelines section of Azure DevOps to confirm the deployment completes successfully.

If successful, the core DE shared resources will now be available in the non-production environment. To view the deployed resources, navigate to the relevant resource group in the Azure portal. The deployed Data Factory resources can be viewed through the Data Factory UI.

Updating Data Factory resources

The structure of the data platform and its Data Factory resources is defined in the project's code repository and deployed through the Azure DevOps pipelines. Changes made to Data Factory resources directly through the UI will be overwritten the next time the deployment pipelines run. See Data Factory development quickstart for further information on updating Data Factory resources.

Step 4: Deploy shared resources in further environments

By default, Ensono Stacks provides a framework for managing the platform across two environments: nonprod and prod. The template CI/CD pipelines are based on these two environments, but they may be amended to suit the specific requirements of your project and organisation.

  • Deployment to the non-production (nonprod) environment is triggered on a feature branch when a pull request is opened (see the sketch after this list).
  • Deployment to the production (prod) environment is triggered on merging to the main branch, followed by manual approval of the release step.
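
As a rough illustration (the actual configuration lives in de-shared-resources.yml and may differ), branch triggers of this kind are expressed in Azure DevOps pipeline YAML along these lines. Note that the manual approval on the prod release step is configured on the Azure DevOps environment rather than in the YAML itself.

trigger:
  branches:
    include:
      - main        # merges to main drive the prod deployment

pr:
  branches:
    include:
      - main        # pull requests targeting main drive the nonprod deployment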

Next steps

Once the shared resources are deployed, you can generate a new data ingest pipeline (optionally implementing the example data source beforehand).