Transform¶
This example shows how Apache logs can be handled with Tremor and Elasticsearch. It is considerably more complex than the initial showcases and combines three components:
Kibana, which can be reached locally once started with docker-compose. It allows browsing through the logs. If you have never used Kibana before, you can get started by clicking on Management and then, in the Elasticsearch section, on Index Management.
Elasticsearch, which stores the submitted logs.
Tremor, which takes the Apache logs, parses and classifies them, then submits them to indexes in Elasticsearch.
In addition, the file demo/data/apache_access_logs.xz is used as the example payload.
Environment¶
In example.trickle we define scripts that extract and categorize Apache logs. Any log that does not conform to the predefined format is dropped. All other configuration is the same as in the previous example and is elided here for brevity.
Business Logic¶
define script extract # define the script that parses our apache logs
script
  match {"raw": event} of # we use the dissect extractor to parse the apache log
    case r = %{ raw ~= dissect|%{ip} %{} %{} [%{timestamp}] "%{method} %{path} %{proto}" %{code:int} %{cost:int}\\n| }
      => r.raw # this first case is hit if the log includes an execution time (cost) for the request
    case r = %{ raw ~= dissect|%{ip} %{} %{} [%{timestamp}] "%{method} %{path} %{proto}" %{code:int} %{}\\n| }
      => r.raw
    default => emit => "bad"
  end
end;
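To make the two dissect cases easier to follow, here is a rough Python sketch of the same idea. It is not Tremor code and not how the dissect extractor is implemented: the regular expressions only approximate the dissect patterns, and the names `extract`, `WITH_COST` and `WITHOUT_COST` are hypothetical. Returning `None` stands in for emitting to the "bad" port.

```python
import re

# Shared prefix approximating the dissect pattern: ip, two skipped fields,
# a bracketed timestamp, the quoted request line, then the status code.
_PREFIX = (
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" (?P<code>\d+) '
)
WITH_COST = re.compile(_PREFIX + r'(?P<cost>\d+)\n')   # first case: numeric cost
WITHOUT_COST = re.compile(_PREFIX + r'\S*\n')          # second case: last field skipped
INT_FIELDS = ("code", "cost")                          # fields typed as :int in dissect


def extract(raw):
    """Return the parsed fields of one log line, or None if no case matches."""
    for pattern in (WITH_COST, WITHOUT_COST):  # cases are tried in order
        m = pattern.fullmatch(raw)
        if m:
            return {
                key: int(value) if key in INT_FIELDS else value
                for key, value in m.groupdict().items()
            }
    return None  # assumption: stands in for `emit => "bad"`
```

As in the script, the order matters: a line with a numeric last field is captured by the first case and keeps its cost, while anything else that still fits the overall shape falls through to the second case and carries no cost field.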
define script categorize # define the script that classifies the logs
with
  user_error_index = "errors", # we use "with" here to default some configuration for
  server_error_index = "errors", # the script, we could then re-use this script in multiple
  ok_index = "requests", # places with different indexes
  other_index = "requests"
script
  let $doc_type = "log"; # doc_type is used by the offramp, the $ denotes this is stored in event metadata
  let $index = match event of
    case e = %{present code} when e.code >= 200 and e.code < 400 # for http codes between 200 and 400 (exclusive) - those are success codes
      => args.ok_index
    case e = %{present code} when e.code >= 400 and e.code < 500 # 400 to 500 (exclusive) are client side errors
      => args.user_error_index
    case e = %{present code} when e.code >= 500 and e.code < 600
      => args.server_error_index # 500 to 600 (exclusive) are server side errors
    default => args.other_index # if we get any other code we just use a default index
  end;
  event # emit the event with its new metadata
end;
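The classification rules can be checked outside Tremor with a small Python sketch. It is an illustration only: the function name `categorize` and the returned dictionary (standing in for the `$doc_type` and `$index` metadata) are assumptions, and the keyword defaults mirror the values in the `with` block.

```python
def categorize(
    event,
    user_error_index="errors",
    server_error_index="errors",
    ok_index="requests",
    other_index="requests",
):
    """Return the metadata the script would attach to an event."""
    code = event.get("code")
    if not isinstance(code, int):
        index = other_index            # no usable code: default index
    elif 200 <= code < 400:
        index = ok_index               # success codes
    elif 400 <= code < 500:
        index = user_error_index       # client side errors
    elif 500 <= code < 600:
        index = server_error_index     # server side errors
    else:
        index = other_index            # any other code
    return {"doc_type": "log", "index": index}
```

Calling `categorize({"code": 404}, user_error_index="client_errors")` shows the purpose of the defaults: the same logic can be reused with different index names without touching the rules.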
Command line testing during logic development¶
$ docker-compose up
... lots of logs ...
Inject test messages via websocat
Note
websocat can be installed via cargo install websocat for the lazy/impatient amongst us.
$ xzcat logs.xz | websocat ws://localhost:4242
...
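Before injecting the payload it can help to look at a few raw lines and confirm they have the shape the extract script expects. The following is a minimal Python sketch using only the standard library; the helper name `head_lines` is hypothetical, and it only reads the archive, it does not send anything to Tremor.

```python
import itertools
import lzma


def head_lines(source, n=5):
    """Return the first n lines of an xz-compressed text file.

    `source` may be a file path or a binary file object.
    """
    with lzma.open(source, "rt", encoding="utf-8", errors="replace") as fh:
        # islice stops after n lines, so the whole archive is not read
        return [line.rstrip("\n") for line in itertools.islice(fh, n)]
```

For example, `head_lines("logs.xz", 3)` would return the first three log lines of the file used in the command above.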
Open the Kibana index management and create indexes to view the data.
Discussion¶
This is a fairly complex example that combines everything we've seen in the prior examples and a bit more. It should serve as a starting point for using tremor to ingest, process, filter and classify data and forward it to an upstream system.
Tip
When using this as a baseline, be aware that settings such as batching will need tuning to fit the infrastructure it is pointed at. Also, since this is not an ongoing data stream, we omitted backpressure and classification-based rate limiting from the example.