Vector v0.56.0 release notes

The COSE team is excited to announce version 0.56.0!

Release highlights

  • Added a new databricks_zerobus sink that streams log data to Databricks Unity Catalog tables through the Zerobus ingestion service. The sink supports OAuth 2.0 authentication, automatic schema fetching from Unity Catalog, and protobuf batch encoding.
  • Added a new delay transform that delays each event by a fixed duration. Events can also be delayed only when they match a condition, which can be written in VRL.
  • HTTP-based sinks that use the shared retry helpers now support a retry_strategy configuration option to control which HTTP response codes are retried. The http sink also includes a new example showing how to retry only specific transient status codes.
  • The vector sink now supports zstd compression in addition to gzip. This provides better compression ratios and performance for Vector-to-Vector communication.
  • The tag_cardinality_limit transform received major enhancements: per-tag cardinality overrides (per_tag_limits), per-metric tracking isolation (tracking_scope: per_metric), a global key cap (max_tracked_keys), and the ability to opt entire metrics out of cardinality tracking.
  • Parquet batch encoding in the aws_s3 sink is now available out of the box in official release binaries for all users.
  • Fixed a CPU regression introduced in 0.50.0 affecting all sinks that use metric normalization such as prometheus_remote_write, aws_cloudwatch_metrics, statsd, and others.
  • Restored support for installing Vector on RHEL 8, Rocky Linux 8, AlmaLinux 8, and CentOS Stream 8, which had been broken since 0.55.0 due to an inadvertent glibc requirement bump.
  • Unit tests now support an optional expected_event_count field on test outputs, allowing assertions on the number of events emitted by a transform.
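
As a rough illustration of the expected_event_count field, the sketch below asserts that a remap transform emits exactly one event for one input. The transform name add_env and the test name are hypothetical, and the placement of expected_event_count directly on the output entry (with an integer value) is an assumption based on the description above; check the unit testing reference for the exact schema.

```yaml
transforms:
  add_env:                       # hypothetical transform under test
    type: remap
    inputs: []
    source: |
      .env = "prod"

tests:
  - name: add_env emits one event per input   # hypothetical test name
    inputs:
      - insert_at: add_env
        type: log
        log_fields:
          message: "hello"
    outputs:
      - extract_from: add_env
        expected_event_count: 1  # assumed placement and type
        conditions:
          - type: vrl
            source: |
              assert_eq!(.env, "prod")
```

A config like this would be exercised with vector test <path>, the same way as existing unit tests.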

Breaking Changes

  • The greptimedb_metrics and greptimedb_logs sinks now require GreptimeDB v1.x. Users running GreptimeDB v0.x must upgrade their GreptimeDB instance before upgrading Vector.

Upgrading Vector

When upgrading, we recommend stepping through minor versions as these can each contain breaking changes while Vector is pre-1.0. These breaking changes are noted in their respective upgrade guides.

Vector Changelog

5 new features

  • HTTP-based sinks that use the shared retry helpers now support a retry_strategy configuration option to control which HTTP response codes are retried. The http sink also includes a new example showing how to retry only specific transient status codes.

    Issue: https://github.com/vectordotdev/vector/issues/10870


    Thanks to ndrsg for contributing this change!
  • Unit tests now support an optional expected_event_count field on test outputs, allowing assertions on the number of events emitted by a transform.
    Thanks to pront for contributing this change!
  • Added a new databricks_zerobus sink that streams log data to Databricks Unity Catalog tables through the Zerobus ingestion service. The sink supports OAuth 2.0 authentication, automatic schema fetching from Unity Catalog, and protobuf batch encoding.
    Thanks to flaviofcruz for contributing this change!
  • Added a new delay transform that delays each event by a fixed duration.
    Thanks to esensar, Quad9DNS for contributing this change!
  • Added ratio_field and rate_field options to the sample transform to support dynamic per-event sampling. A static rate or ratio must still be configured as a fallback, and ratio_field and rate_field cannot be used together.
    Thanks to jhammer for contributing this change!
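
A minimal sketch of dynamic sampling with the sample transform might look like the following. The component names and the sample_ratio event field are hypothetical, and the assumption that ratio_field takes the name of an event field holding a per-event ratio (with the static ratio used when that field is unusable) is inferred from the description above rather than taken from the reference documentation.

```yaml
transforms:
  dynamic_sample:                # hypothetical component name
    type: sample
    inputs: ["my_source"]        # hypothetical upstream component
    ratio: 0.1                   # static fallback, required alongside ratio_field
    ratio_field: sample_ratio    # hypothetical event field with a per-event ratio
    # rate_field is not set here: it cannot be combined with ratio_field
```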

10 enhancements

  • HTTP-based sinks using the shared retry logic now classify transport-layer failures with HttpError::is_retriable: connection and TLS connector issues may be retried, while failures such as invalid HTTP request construction or an invalid proxy URI are not. Setting retry_strategy to none disables retries for these transport errors and for request timeouts, in addition to status-code-based retries.

    Issue: https://github.com/vectordotdev/vector/issues/10870


    Thanks to ndrsg for contributing this change!
  • The vector sink now supports zstd compression in addition to gzip. This provides better compression ratios and performance for Vector-to-Vector communication.

    The compression configuration has been enhanced to support multiple algorithms while maintaining full backward compatibility:

    Legacy boolean syntax (still supported)

    sinks:
      my_vector:
        type: vector
        address: "localhost:6000"
        compression: true     # Uses gzip (default)
        # or, to disable compression:
        # compression: false
    

    New string syntax

    sinks:
      my_vector:
        type: vector
        address: "localhost:6000"
        compression: "zstd"  # Use zstd compression
        # Supported values: "none", "gzip", "zstd"
    

    The Vector source automatically accepts both gzip and zstd compressed data, enabling seamless communication between Vector instances using different compression algorithms.


    Thanks to jpds for contributing this change!
  • The opentelemetry source’s gRPC OTLP receiver now accepts zstd-compressed requests in addition to gzip, matching the compression schemes advertised via the grpc-accept-encoding response header. No configuration change is required; clients can send OTLP payloads with grpc-encoding: zstd and they will be transparently decompressed.
    Thanks to jpds for contributing this change!
  • The custom auth strategy for the http_server source now supports event enrichment via metadata writes. VRL programs can write %field = value during authentication; those values are injected into every successfully authenticated event. The event body (.field) remains read-only. Existing custom programs that do not write metadata are unaffected.
    Thanks to 20agbekodo for contributing this change!
  • Bumped serde_json to 1.0.149 and serde_with to 3.18.0. serde_json switched its float-to-string formatter from Ryū to Żmij in 1.0.147, so floats serialized via the native_json codec may render with slightly different textual form (for example 1e+16 instead of 1e16). The change is purely cosmetic: parsed f32/f64 values round-trip identically, and Vector-to-Vector communication between old and new versions is unaffected.
    Thanks to pront for contributing this change!
  • The splunk_hec source now accepts optional per-endpoint codec configuration via event: { framing, decoding } and raw: { framing, decoding }. When decoding is set on an endpoint, Vector applies a second decoding pass after the HEC envelope is parsed: on /services/collector/event the envelope’s event field is fed through the codec, and on /services/collector/raw the request body is fed through the codec directly. A single payload can fan out to multiple events.

    For example, to decode JSON payloads in /event requests while splitting /raw bodies on newlines:

    sources:
      hec:
        type: splunk_hec
        address: 0.0.0.0:8088
        event:
          decoding:
            codec: json
        raw:
          framing:
            method: newline_delimited
          decoding:
            codec: bytes
    

    Thanks to thomasqueirozb for contributing this change!
  • The tag_cardinality_limit transform now accepts a top-level per_tag_limits map, mirroring the per-metric one: mode: limit_override to set a per-tag cap, or mode: excluded to bypass cardinality tracking for that tag on every metric without a per_metric_limits entry.
    Thanks to kaarolch for contributing this change!
  • Reduced the memory usage of the tag_cardinality_limit transform when running in exact mode by allocating less unused memory on initialization.
    Thanks to ArunPiduguDD for contributing this change!
  • The tag_cardinality_limit transform gained two new configuration capabilities:

    • Per-tag overrides (per_tag_limits): configure cardinality limits per tag key within a metric, or exclude individual tags from tracking.
    • Metric exclusion: opt entire metrics out of cardinality tracking via mode: excluded in per_metric_limits.

    Thanks to ArunPiduguDD for contributing this change!
  • The tag_cardinality_limit transform gained two new settings:

    • tracking_scope: isolate tag tracking per metric (per_metric) instead of sharing a single bucket across all metrics (global, the default).
    • max_tracked_keys: cap the total number of tag keys tracked to bound memory usage.

    Thanks to ArunPiduguDD for contributing this change!
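
Taken together, the tag_cardinality_limit options described above could be combined roughly as follows. This is a sketch under stated assumptions: the tag names, metric name, and numeric limits are hypothetical, and the use of value_limit inside a limit_override entry is an assumption modeled on the existing top-level option. Only the option names per_tag_limits, per_metric_limits, tracking_scope, max_tracked_keys, and the modes limit_override and excluded come from these notes.

```yaml
transforms:
  limit_tags:                      # hypothetical component name
    type: tag_cardinality_limit
    inputs: ["my_metrics"]         # hypothetical upstream component
    mode: exact
    value_limit: 500               # default cap per tag key
    limit_exceeded_action: drop_tag
    tracking_scope: per_metric     # isolate tracking per metric (default: global)
    max_tracked_keys: 10000        # hypothetical global cap on tracked tag keys
    per_tag_limits:
      pod_name:                    # hypothetical tag key
        mode: limit_override
        value_limit: 50            # assumed field name for the per-tag cap
      env:                         # hypothetical tag key
        mode: excluded             # bypass cardinality tracking for this tag
    per_metric_limits:
      build_info:                  # hypothetical metric name
        mode: excluded             # opt the whole metric out of tracking
```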

20 bug fixes

  • The default /etc/vector/vector.yaml config file is no longer installed by the Debian, RPM, Alpine, and distroless-static Docker packages. The previous default ran a demo_logs source and printed synthesized syslog lines to stdout, which then surfaced in journald or /var/log/ on hosts running Vector as a service and was a common source of confusion.

    New installs will now have no active config on disk. Provide your own configuration at /etc/vector/vector.yaml (or pass --config <path>) before starting Vector. A reference example is shipped at /usr/share/vector/examples/vector.yaml, and more sample configs remain at /etc/vector/examples/.

    Existing installs are unaffected on upgrade: package managers preserve the on-disk /etc/vector/vector.yaml if you already had one.


    Thanks to pront for contributing this change!
  • Fixed a CPU regression introduced in 0.50.0 affecting all sinks that use metric normalization such as prometheus_remote_write, aws_cloudwatch_metrics, statsd, and others.

    The only exception is the incremental_to_absolute transform when max_bytes or max_events are configured, where the overhead is expected and necessary for eviction to work correctly.


    Thanks to thomasqueirozb for contributing this change!
  • The shared gRPC decompression layer now rejects request frames that set the compressed flag without a negotiated grpc-encoding (e.g. identity or a missing header). Previously, such malformed frames were silently decoded as gzip, which could mask client/server compression-negotiation bugs.
    Thanks to jpds for contributing this change!
  • Fixed an issue during in-place reload of a sink with a disk buffer configured, where the component would stall for batch.timeout_sec before gracefully reloading. This fix also resolves an issue where Vector ignored SIGINT when a pipeline stall occurred.
    Thanks to graphcareful for contributing this change!
  • The windows_event_log source no longer freezes after periods of inactivity.
    Thanks to tot19 for contributing this change!
  • Sinks using batch encoding (Parquet, Arrow IPC) now consistently emit ComponentEventsDropped for every encode failure path. Previously some build_record_batch failures (notably type mismatches) dropped events silently. A new EncoderRecordBatchError internal event also reports component_errors_total with error_code="arrow_json_decode" or "arrow_record_batch_creation" at stage="sending" for granular alerting.
    Thanks to pront for contributing this change!
  • The error log and metric that the splunk_hec source emits on missing or invalid auth headers now specify "authentication_failed" as the error_type.
    Thanks to 20agbekodo for contributing this change!
  • Restored support for installing Vector on RHEL 8, Rocky Linux 8, AlmaLinux 8, and CentOS Stream 8, which had been broken since v0.55.0 due to an inadvertent glibc requirement bump.
    Thanks to pront for contributing this change!
  • Restored the full VRL stdlib, including get_env_var, in the standalone VRL CLI and web playground by default.
    Thanks to pront for contributing this change!
  • Parquet encoding in the aws_s3 sink (batch_encoding) now works out of the box in the official release binaries. Previously it required compiling Vector from source with the codecs-parquet feature.
    Thanks to pront for contributing this change!
  • The windows_event_log source now adds standard source metadata, including source_type, to emitted log events.
    Thanks to tot19 for contributing this change!
  • Fixed a bug in the file source where checkpoints recording the last-read file position were not always fully written before Vector shut down. On the next startup, the file source could start reading from an earlier position, causing events to be re-processed.
    Thanks to thomasqueirozb for contributing this change!
  • The aggregate transform now correctly passes through or ignores metrics whose kind is not supported by the configured mode. Prior to this change, these metrics would be silently dropped, contrary to the officially documented behavior. For example, absolute metrics flowing through a sum-mode aggregate transform are now forwarded to the next step in the pipeline unchanged rather than being dropped:

    {kind: incremental, type: counter, name: "http.requests", value: 10}  → summed into aggregate
    {kind: absolute,    type: gauge,   name: "cpu.usage",     value: 0.83} → previously dropped, now passes through unchanged
    {kind: incremental, type: counter, name: "http.requests", value: 5}   → summed into aggregate
    

    If you want to preserve the previous drop behavior, add a filter transform before the aggregate transform to discard the unwanted metric kind.


    Thanks to ArunPiduguDD for contributing this change!
  • The aws_s3 and clickhouse sinks now correctly advertise only the batch_encoding.codec values they actually support: parquet for aws_s3 and arrow_stream for clickhouse. Previously, the documentation and configuration schema listed both codecs for both sinks, even though picking the wrong one produced a startup error.
    Thanks to flaviofcruz for contributing this change!
  • Fixed a crash that could occur when a source or transform emitted an empty event batch into a topology with downstream buffers. Vector now drops empty batches before they reach those buffers and logs a warning identifying the upstream component.
    Thanks to graphcareful for contributing this change!
  • The text content generated by the demo_logs source has changed: the pool of fake usernames and the pool of fake domain TLDs are now both defined inside Vector rather than pulled from an external crate. The line formats (apache_common, apache_error, json, syslog, bsd_syslog) are unchanged. If any of your tests or downstream pipelines assert on specific generated usernames or TLDs, update those expectations.
    Thanks to pront for contributing this change!
  • Fixed a bug in the topology builder causing component metrics registered at build time to miss the component tags if the component build function awaits non-trivially.

    This notably affected sinks using a disk buffer, and sources or sinks performing IO work in the build function.


    Thanks to gwenaskell for contributing this change!
  • Fixed a bug in the mqtt source where user-provided TLS client certificates (crt_file / key_file) were being silently ignored, breaking mTLS connections to strict brokers like AWS IoT Core.
    Thanks to mr- for contributing this change!
  • Redact sink-specific API key headers (DD-API-KEY, X-Honeycomb-Team, x-api-key, Api-Key) in debug-level HTTP request and response logs, alongside the existing standard headers (Authorization, Proxy-Authorization, Proxy-Authenticate, WWW-Authenticate, Cookie, Set-Cookie, Cookie2).
    Thanks to pront for contributing this change!
  • TCP-based sources that emit acknowledgements (fluent, logstash) no longer log a spurious "Error writing acknowledgement, dropping connection." message at ERROR level when the ack write fails because the peer cleanly closed its TLS session (for example, during a rolling pod restart). These graceful shutdowns now log at WARN and no longer increment component_errors_total{error_code="ack_failed", ...}, preventing operator dashboards/alerts from firing on routine peer disconnects. Genuine ack write failures are still logged at ERROR and continue to increment component_errors_total.

    The connection_shutdown_total{mode="tcp"} counter is now incremented exactly once per accepted source connection when its per-connection task exits, pairing with ConnectionOpen — regardless of cause (TLS handshake failure, shutdown signal during handshake, graceful peer EOF, decoder failure, downstream closed, ack write failure, tripwire, max connection duration). Previously it was not emitted by TCP sources at all.


    Thanks to taylorchandleryoung for contributing this change!
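
For the aggregate transform fix listed among the bug fixes, the previous drop behavior can be approximated by filtering on the metric kind before aggregation. The sketch below keeps only incremental metrics ahead of a sum-mode aggregate. Component names are hypothetical, and the spelling of the mode value should be checked against the aggregate transform reference.

```yaml
transforms:
  only_incremental:                # hypothetical component name
    type: filter
    inputs: ["my_metrics"]         # hypothetical upstream component
    condition:
      type: vrl
      # Keep incremental metrics; absolute metrics are discarded here
      source: '.kind == "incremental"'

  summed:                          # hypothetical component name
    type: aggregate
    inputs: ["only_incremental"]
    interval_ms: 10000
    mode: Sum                      # verify exact value spelling in the reference
```

Placing the filter upstream keeps the decision explicit in the topology instead of relying on the aggregate transform to discard unsupported kinds.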

1 chore

  • The greptimedb_metrics and greptimedb_logs sinks now require GreptimeDB v1.x. Users running GreptimeDB v0.x must upgrade their GreptimeDB instance before upgrading Vector.
    Thanks to thomasqueirozb for contributing this change!

VRL Changelog

[0.33.1 (2026-06-02)]

Fixes

  • Reverted parse_regex changes from 0.33.0 which introduced a performance regression in multi-threaded scenarios.

(https://github.com/vectordotdev/vrl/pull/1789)

[0.33.0 (2026-05-28)]

New Features

  • VRL string literals now support \u{HEX} Unicode escape sequences. Any valid Unicode scalar value can be expressed, e.g. "hello\u{1F30E}world". Invalid sequences (empty braces, non-hex digits, surrogate codepoints, or values above U+10FFFF) are reported as a compile-time error.

(https://github.com/vectordotdev/vrl/pull/1771)
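
As a small illustration, the escape from the entry above can be used inside a remap transform like this. The component names and field names are hypothetical; the YAML block scalar (|) passes backslashes through to VRL unchanged.

```yaml
transforms:
  greet:                           # hypothetical component name
    type: remap
    inputs: ["my_source"]          # hypothetical upstream component
    source: |
      # U+1F30E is a globe emoji code point
      .greeting = "hello\u{1F30E}world"
      # U+00E9 is the letter e with an acute accent
      .cafe = "caf\u{00E9}"
```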

  • parse_regex now accepts dynamic regex patterns (variables and runtime expressions), consistent with parse_regex_all. When the pattern is a literal, return type information remains precise based on named capture groups.

(https://github.com/vectordotdev/vrl/pull/1774)

Enhancements

  • Updated the user agent data for the parse_user_agent function.

(https://github.com/vectordotdev/vrl/pull/1776)

  • Protobuf encoding now coerces compatible scalar types into the target field type: integers and strings are accepted for bool fields (using the same parsing as to_bool), and integers are accepted for float/double fields. Previously these inputs failed encoding and required explicit conversion in VRL.

(https://github.com/vectordotdev/vrl/pull/1763)

  • Added an optional allow_lossy_string_coercion argument to encode_proto. VRL’s protobuf encoding stringifies Boolean, Integer, Float, and Timestamp values when assigned to a protobuf string field as a convenience for callers handling loosely typed input. The protobuf JSON mapping only accepts a JSON string for a string field, so callers who want strict spec-compliant encoding can now pass allow_lossy_string_coercion: false. The default stays true, so today’s behavior is unchanged.

(https://github.com/vectordotdev/vrl/pull/1764)
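
A sketch of opting into strict string handling is shown below. The descriptor file schema.desc, the message name MyMessage, and the .payload field are hypothetical placeholders; only the allow_lossy_string_coercion argument name and its default come from the entry above.

```yaml
transforms:
  to_proto:                        # hypothetical component name
    type: remap
    inputs: ["my_source"]          # hypothetical upstream component
    source: |
      # With coercion disabled, a non-string value assigned to a protobuf
      # string field is expected to fail encoding instead of being stringified.
      .encoded = encode_proto!(.payload, "schema.desc", "MyMessage", allow_lossy_string_coercion: false)
```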

  • Improved performance of parse_regex/parse_regex_all by pre-computing capture group names and indices at compile time. Users may see anywhere from 4% to 13% speedups in some cases.

(https://github.com/vectordotdev/vrl/pull/1773)

  • Improved performance of parse_regex_all by reusing the compiled regex across invocations.

(https://github.com/vectordotdev/vrl/pull/1775)

Fixes

  • The compiler now reports every unhandled-error in a single compilation pass instead of stopping at the first one. For example:

    {
      push(.x, 1)
      .b = push(.y, 2)
    }

    now reports both push(.x, 1) (unhandled error) and .b = push(.y, 2) (unhandled fallible assignment) in one go. Previously you’d only see the second one, fix it, recompile, and only then discover the first.

(https://github.com/vectordotdev/vrl/pull/1759)

  • Fixed a confusing compile error where a fallible call earlier in a block could cause a later, unrelated assignment to be reported as the problem. For example:

    {
      .a = 1
      push(.x, 1)        # the unhandled error is actually here
      .b = 2             # but the compiler used to flag this line
    }

    The error is now reported on the actual fallible expression, so adding ! or the , err = form fixes it where you’d expect. This also fixes the same shape inside closure bodies, e.g. inside for_each/map_values.

(https://github.com/vectordotdev/vrl/pull/453)
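
The two ways of handling a fallible call mentioned above can be sketched as follows, wrapped in a remap transform. Component and field names are hypothetical; this is one possible arrangement, not the only valid one.

```yaml
transforms:
  handle_errors:                   # hypothetical component name
    type: remap
    inputs: ["my_source"]          # hypothetical upstream component
    source: |
      .a = 1
      # Option 1: the ! form aborts the program if push fails
      .x = push!(.x, 1)
      # Option 2: the ", err =" form captures the error instead of aborting
      .y, err = push(.y, 2)
      if err != null {
        .push_error = err
      }
      .b = 2
```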

  • Fixed a false positive in the unused-variable diagnostic (E900) where a variable used before being reassigned (shadowed) was incorrectly flagged as unused at its original assignment.

(https://github.com/vectordotdev/vrl/pull/1743)

  • encode_proto and parse_proto now support proto maps whose keys are integers or booleans, not just strings. Because VRL object keys are always strings, integer and boolean keys are written in their string form:

    encode_proto({ "by_id": { "42": "alice" } }, "schema.desc", "MyMessage")

    Previously parse_proto errored on these maps and encode_proto silently dropped the field. Note that encode_proto will now return an error if a key string can’t be parsed into the schema’s key type (for example, "abc" against a map<int32, ...>).

(https://github.com/vectordotdev/vrl/pull/1762)

  • Fixed a typo in an enum variant that made it impossible to use SCREAMING_SNAKE in casing functions such as pascalcase, camelcase, and others.

    pascalcase("hello", original_case: "SCREAMING_SNAKE") now compiles properly.

(https://github.com/vectordotdev/vrl/pull/1770)

  • Allowed the else keyword (and else if) to appear on a new line after the closing } of an if-block. Previously the trailing newline terminated the if-expression at the parser level, forcing else to share a line with }.

authors: pront

(https://github.com/vectordotdev/vrl/pull/1756)
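
With this parser change, a layout like the following is expected to compile. The component names and the .status and .level fields are hypothetical.

```yaml
transforms:
  classify:                        # hypothetical component name
    type: remap
    inputs: ["my_source"]          # hypothetical upstream component
    source: |
      if .status == 200 {
        .level = "info"
      }
      else if .status == 404 {
        .level = "warn"
      }
      else {
        .level = "error"
      }
```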

Download Version 0.56.0

  • macOS: tar.gz
  • Windows: zip
  • Windows (MSI): msi