Pricing
This page describes the pricing for each Data Transformation Pipeline run in the Super Parallel Processing infrastructure.
Overview
A Data Transformation Pipeline is billed for the compute resources it uses. The measured compute resources are CPU and memory.
Although pricing rates are quoted per hour, Super Parallel Processing usage is billed in per-second increments on a per-pipeline basis. Usage is stated in hours so that hourly rates can be applied to second-by-second use; for example, 30 minutes is 0.5 hours. Workers and jobs consume resources as described in the following sections.
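As a minimal sketch of how per-second usage maps onto hourly rates, the following Python snippet (the function name and structure are illustrative, not part of the service) converts a duration in seconds to hours and applies an hourly rate:

```python
def billed_cost(usage_seconds: float, hourly_rate: float) -> float:
    """Convert per-second usage to hours and apply an hourly rate."""
    usage_hours = usage_seconds / 3600.0   # e.g. 1,800 seconds -> 0.5 hours
    return usage_hours * hourly_rate

# 30 minutes of one batch vCPU at $0.1232 per vCPU per hour:
print(billed_cost(30 * 60, 0.1232))        # 0.0616
```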
Worker CPU and memory
Each Super Parallel Processing pipeline uses at least one worker node. The pipeline service provides two worker types: batch and streaming. Batch and streaming workers have separate service charges.
Pipeline workers consume the following resources, each billed on a per-second basis:
- CPU
- Memory
You can override the default worker count for a job. If you enable autoscaling, you can specify the maximum number of workers to allocate to a pipeline; workers and their associated resources are then added and removed automatically as the pipeline scales.
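Because each worker is billed per second for only the time it exists, the billed resource-hours of an autoscaled pipeline are the sum over its workers. The following is a rough sketch with a hypothetical scaling timeline (the durations and function name below are made up for illustration):

```python
def vcpu_hours(worker_runtimes_seconds, vcpus_per_worker):
    """Total billed vCPU-hours across workers that ran for the given durations."""
    return sum(s / 3600.0 for s in worker_runtimes_seconds) * vcpus_per_worker

# Hypothetical timeline: one worker runs a full hour, then autoscaling adds
# two more workers for the final 20 minutes.
runtimes = [3600, 1200, 1200]          # seconds each worker existed
print(vcpu_hours(runtimes, 1))         # ~1.667 vCPU-hours for 1-vCPU batch workers
```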
Pipeline compute resource pricing
The following table contains pricing details for worker resources.
| Job type  | CPU (per vCPU per hour) | Memory (per GB per hour) |
| --------- | ----------------------- | ------------------------ |
| Batch     | $0.1232                 | $0.0085                  |
| Streaming | $0.1518                 | $0.0085                  |
- Batch worker defaults: 1 vCPU, 3.75 GB memory, 250 GB Persistent Disk.
- Streaming worker defaults: 4 vCPU, 15 GB memory, 400 GB Persistent Disk per worker, with a minimum of two workers.
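Combining the table rates with the default worker sizes gives the compute cost of a single worker per hour. This is a rough sketch only; the constant and function names are illustrative, and disk cost is billed separately (see the next section):

```python
CPU_RATE = {"batch": 0.1232, "streaming": 0.1518}    # $ per vCPU per hour
MEM_RATE = 0.0085                                    # $ per GB per hour

def worker_hourly_cost(job_type: str, vcpus: float, memory_gb: float) -> float:
    """Hourly CPU + memory cost of one worker."""
    return vcpus * CPU_RATE[job_type] + memory_gb * MEM_RATE

print(worker_hourly_cost("batch", 1, 3.75))      # 0.1232 + 0.031875 = 0.155075
print(worker_hourly_cost("streaming", 4, 15))    # 0.6072 + 0.1275   = 0.7347
```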
Pipeline storage resource pricing
Storage resources are billed at the same rate for streaming and batch pipelines. You can change the default disk size or disk type at any time.
| Storage – Standard Persistent Disk (per GB per hour) | Storage – SSD Persistent Disk (per GB per hour) |
| ----------------------------------------------------- | ------------------------------------------------ |
| $0.0001296                                             | $0.0007152                                        |
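As a worked example, and assuming the default worker disks listed earlier are Standard Persistent Disks (the default disk type is an assumption here), hourly storage cost follows directly from the table:

```python
STANDARD_PD_RATE = 0.0001296   # $ per GB per hour
SSD_PD_RATE = 0.0007152        # $ per GB per hour

# Default batch worker: 250 GB Standard Persistent Disk.
print(250 * STANDARD_PD_RATE)        # 0.0324 per hour

# Default streaming job: 400 GB per worker, with a minimum of two workers.
print(2 * 400 * STANDARD_PD_RATE)    # 0.10368 per hour
```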
The pipeline service is currently limited to 15 persistent disks per worker instance when running a streaming job. Each persistent disk is local to an individual worker node. A 1:1 ratio between workers and disks is the minimum resource allotment.
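Putting compute and storage together, a minimal end-to-end sketch (assuming a single default batch worker and that its 250 GB Persistent Disk is the Standard type) estimates the cost of a 30-minute batch run:

```python
hours = 30 * 60 / 3600.0                          # 0.5 hours, billed per second
compute = hours * (1 * 0.1232 + 3.75 * 0.0085)    # vCPU + memory
storage = hours * (250 * 0.0001296)               # Standard Persistent Disk
print(compute + storage)                          # about $0.094 for the run
```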