Introduction

I have a project to add support for two new block types to Wireshark. This doesn't seem to be documented anywhere and so I'm hoping that my notes here may help someone in the future.

This screenshot (attachment: trb_screenshot.png) shows the dissector in action.

This is work in progress and so the notes here are not complete. Also, I'm using this as a notepad and I may make mistakes which I'll correct later. If you notice mistakes, please feel free to update this page.

PCAP-NG Block Basics

A PCAP-NG file contains blocks of data. Each block contains:

  • Block Type - a 32-bit unsigned integer value identifying the type of block
  • Block Length - a 32-bit unsigned integer that is set to the length of the block, including the Block Type field, both Block Length fields and the block content
  • Block Content - or payload
  • Block Length - a repeat of the earlier Block Length value
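
The framing rules above can be checked with a few lines of C. This is a minimal sketch, not code from the plugin: the function names are mine, and it assumes the block is already in memory and that blocks are padded to a multiple of 4 bytes (as the diagrams in Appendix D indicate).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value, optionally reversing the byte order. */
static uint32_t read_u32(const uint8_t *p, int byte_swapped)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    if (byte_swapped) {
        v = ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
            ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
    }
    return v;
}

/*
 * Check the framing of one block held in buf[0..avail).
 * Returns the Block Content length, or -1 if the framing is invalid.
 * The 12 bytes of overhead are Block Type plus the two Block Length fields.
 */
long block_content_length(const uint8_t *buf, size_t avail, int byte_swapped)
{
    uint32_t total;

    assert(buf != NULL);
    if (avail < 12)
        return -1;
    total = read_u32(buf + 4, byte_swapped);
    if (total < 12 || total % 4 != 0 || total > avail)
        return -1;
    /* The trailing Block Length must repeat the leading one. */
    if (read_u32(buf + total - 4, byte_swapped) != total)
        return -1;
    return (long)(total - 12);
}
```

The trailing length is what lets a reader walk the file backwards, so a mismatch is a strong sign of a corrupt or mis-generated block.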

Block Types are grouped as follows:

  • Standard Block Types with native support in Wireshark
  • Standard Block Types recognised in Wireshark but with support added via a plugin
  • Custom Block which is not yet supported by Wireshark
  • Local Use Block Types that are locally managed by a developer and not guaranteed to be unique

I have requested two Standard Block Types for my project - see Appendix E below. A recent code change (25583) prevents us from using anything other than Local Use Block Types during development.
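
As a quick illustration, the PCAP-NG specification reserves block type codes with the most significant bit set for local use, which is why the values in Appendix D start with 0x8. The helper names below are mine and are not part of Wireshark.

```c
#include <assert.h>
#include <stdint.h>

/* The most significant bit of the Block Type marks a local-use type. */
#define BLOCK_TYPE_LOCAL_USE_BIT 0x80000000u

int is_local_use_block_type(uint32_t block_type)
{
    return (block_type & BLOCK_TYPE_LOCAL_USE_BIT) != 0;
}

/* Usage: the two block types from Appendix D are local use, the EPB is not. */
void check_project_block_types(void)
{
    assert(is_local_use_block_type(0x80000010u));   /* TSDB */
    assert(is_local_use_block_type(0x80000011u));   /* TRB */
    assert(!is_local_use_block_type(0x00000006u));  /* Enhanced Packet Block */
}
```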

Within the groups of Block Types, there is a further sub-grouping:

  • Record Blocks - blocks that generate an entry in the Wireshark Packet List
  • Internal Blocks - blocks that don't generate a Packet List entry

Objective

The objective of the project is to add Wireshark support for the display, filtering, etc. of text log data (machine data). The data is presented to Wireshark in a PCAP-NG file that contains two new block types:

  • TSDB - Text Source Descriptor Block that defines the layout of the data records
    • The data in the TSDB is used to define heading fields i.e. the heading fields aren't predefined as they typically are in dissectors, but rather defined at file load time (and cleared when the file is closed)
    • This block is analogous to the Interface Descriptor Block found in a network packet capture
  • TRB - Text Record Block that contains the log record data

See Appendix D below for the block formats.

The initial data being used is Apache HTTPD Common format log records, but I'm designing the solution so that any format of log data can be supported. I've started with the Apache HTTPD log data as it is a fairly simple format: space-separated variables in fixed columns.

Test PCAP-NG Generation

Of course, the above raises the question, "What creates the PCAP-NG file with the new blocks?". At this time I'm using the Babel function that comes with TribeLab Workbench. The project that should follow this one will be to write a Wiretap reader for log files.

Babel produces the PCAP-NG file like this:

    log_file -----------------------------------> TRBs
                           ^
                           |
    apache-common.xml -----+--------------------> TSDB

An XML file describes the format of the log file. The XML is used to generate the TSDB, and some elements of it are used to help parse the log records to form TRBs.

  • See Appendix A below for an example of the XML file
  • See Appendix C below for a sample PCAP-NG file
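
To give a feel for the parsing step, here is a rough sketch of splitting one Apache common-format line into columns. It is not Babel's code: the function name is hypothetical, the delimiter, quote and bracket characters are hard-coded rather than taken from the XML, and escaped quotes inside a quoted column are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split one log line in place into at most max_cols columns.
 * Columns are separated by a single space; a column that starts with '"'
 * or '[' runs to the matching '"' or ']' and the enclosing characters are
 * dropped. Returns the number of columns found. Any trailing newline
 * stays attached to the last column.
 */
size_t split_log_line(char *line, char *cols[], size_t max_cols)
{
    size_t n = 0;
    char *p = line;

    assert(line != NULL && cols != NULL);
    while (*p != '\0' && n < max_cols) {
        char close = ' ';
        char *end;

        if (*p == '"') { close = '"'; p++; }
        else if (*p == '[') { close = ']'; p++; }

        cols[n++] = p;
        end = strchr(p, close);
        if (end == NULL)
            break;              /* last column runs to the end of the line */
        *end = '\0';
        p = end + 1;
        if (close != ' ' && *p == ' ')
            p++;                /* skip the delimiter after a quote or bracket */
    }
    return n;
}
```

Run against the example line from Appendix A, this yields seven columns, matching the seven column definitions in the XML.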

NB: Although I'm using Babel to generate the file, anyone can use any tool to generate a suitable file. There is nothing proprietary about the TSDB or TRB formats.

Other points

I'm trying to add this support completely through the plugin framework, and avoid having to make any changes to core Wireshark code. There is an API to add support for new block types via plugins, but I think this may be the first project to use this functionality; there could be bugs and it may not be complete.

Even though the code I am writing has nothing to do with network packets, Wireshark still refers to the list of events in the top pane as the Packet List, and various structures that we need to use refer to all events as packets, most notably the wtap_pkthdr structure.

TSDB Handling

The TSDB defines the type and meaning of fields. Wireshark should not generate a "Packet List" entry for this block. Later, this type of block is referred to as an internal block. If you want to add support for a new internal block to your dissector, think of the TSDB as a template.

The TSDB defines each field through TLVs (type-length-value). The types map to native Wireshark field types with two important exceptions.

Field Type Encoding

The encoded integer values for the field types are not the same as the integer values used within Wireshark. This is because the Wireshark types are generated via an enumerated list. A change to the list could change the enumerated values. If we used these values within the TSDB, we would have compatibility problems. Wireshark field type values start at 0. The TSDB field type values start at 1001.

The Wireshark field type values can be found in epan/ftypes/ftype.h. The mapping of TSDB values to Wireshark field values is in the array babeltowsft.
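
The idea of a stable on-disk code mapped through a table can be sketched as follows. This is an illustration only: the enum names and the individual code assignments are made up (only the 1001 starting value comes from the text above), and the real babeltowsft array maps to enum ftenum values from epan/ftypes/ftype.h rather than to the stand-in enum used here.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for a few Wireshark field types (real code: enum ftenum). */
enum demo_ftenum { DEMO_FT_NONE = 0, DEMO_FT_UINT32, DEMO_FT_STRINGZ, DEMO_FT_IPv4 };

/* Hypothetical on-disk codes; only the 1001 base is taken from the notes. */
#define DEMO_TS_FT_BASE 1001
enum demo_ts_ft {
    DEMO_TS_FT_UINT32 = DEMO_TS_FT_BASE,
    DEMO_TS_FT_STRINGZ,
    DEMO_TS_FT_IPv4
};

/* Mapping table indexed by (on-disk code - base). */
static const enum demo_ftenum demo_ts_to_ws[] = {
    [DEMO_TS_FT_UINT32  - DEMO_TS_FT_BASE] = DEMO_FT_UINT32,
    [DEMO_TS_FT_STRINGZ - DEMO_TS_FT_BASE] = DEMO_FT_STRINGZ,
    [DEMO_TS_FT_IPv4    - DEMO_TS_FT_BASE] = DEMO_FT_IPv4,
};

/* Translate an on-disk code; unknown codes map to DEMO_FT_NONE. */
enum demo_ftenum demo_ts_ft_to_ws(int ts_code)
{
    size_t n = sizeof demo_ts_to_ws / sizeof demo_ts_to_ws[0];

    if (ts_code < DEMO_TS_FT_BASE || (size_t)(ts_code - DEMO_TS_FT_BASE) >= n)
        return DEMO_FT_NONE;
    return demo_ts_to_ws[ts_code - DEMO_TS_FT_BASE];
}

/* Usage: known codes translate, anything outside the table is rejected. */
void demo_mapping_usage(void)
{
    assert(demo_ts_ft_to_ws(1001) == DEMO_FT_UINT32);
    assert(demo_ts_ft_to_ws(0) == DEMO_FT_NONE);
}
```

Because the file only ever stores the on-disk codes, a reordering of the Wireshark enumeration changes the table's right-hand side but not existing capture files.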

Special Cases

We need to deal with two special cases. A log record could contain many date-time values. We need to indicate which value should be used in the Wireshark packet list. This is done through the EVENT_DATETIME field type.

A log often mixes IPv4 and IPv6 addresses in the same column; both Apache HTTPD and Microsoft IIS do this. To accommodate this we have a TS_FT_IPvx field type.
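
One way to resolve such a mixed column at parse time is to try each address family in turn. The sketch below uses the POSIX inet_pton() call; the enum and function names are hypothetical, and this is not necessarily how the dissector handles TS_FT_IPvx.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

enum ipvx_kind { IPVX_NONE = 0, IPVX_V4 = 4, IPVX_V6 = 6 };

/*
 * Parse a text address as IPv4 first, then IPv6.
 * On success the binary address is written to out (4 or 16 bytes used).
 */
enum ipvx_kind parse_ipvx(const char *text, unsigned char out[16])
{
    assert(text != NULL && out != NULL);
    if (inet_pton(AF_INET, text, out) == 1)
        return IPVX_V4;
    if (inet_pton(AF_INET6, text, out) == 1)
        return IPVX_V6;
    return IPVX_NONE;   /* e.g. the "-" missing-value marker */
}
```

The return value tells the caller which of the two native Wireshark address types to use for that particular record.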

Internal Block

This type of block carries information but doesn't generate an entry in the packet list, similar to the Interface Descriptor Block (IDB). This is the simplest kind of block to handle. There are two steps:

  • Define a function that processes the block data - referred to as the read block function
  • Register the read block function with Wireshark

It looks something like this:

gboolean tsdb_read_block(FILE_T fh, guint32 block_data_len, gboolean c, wtapng_block_t *wtapng_block,
    int *err, gchar **err_info)
{
    /* Use i as a general purpose index */
    size_t i;

    /* Signal that this isn't a record block */
    wtapng_block->internal = TRUE;

    /*
    * Is the size of this block reasonable for a TSDB?
    */
    if (block_data_len == 0 || block_data_len > wtapng_block->frame_buffer->allocated) {
        /* Not looking good. */
        *err = WTAP_ERR_BAD_FILE;
        *err_info = wmem_strdup_printf(wmem_file_scope(), "tsdb_read_block: block data length of %u is invalid",
            block_data_len);
        return FALSE;
    }

    /* read block content; wtap_read_bytes sets *err and *err_info on failure */
    if (!wtap_read_bytes(fh, wtapng_block->frame_buffer->data, block_data_len, err, err_info)) {
        return FALSE;
    }

    /* ... do more stuff ... */

    return TRUE;
}

void
proto_register_tsd(void)
{
    /* ... do stuff ... */
    register_pcapng_block_type_handler(BLOCK_TYPE_TSDB, tsdb_read_block, NULL);
}

See Appendix B for sample code.

Record Block

Introduction

Within the Wireshark code, any block that generates an entry in the packet list is referred to as a record block. Historically they have also been called packet blocks, and some code comments still use this term.

Encapsulation

A record block encapsulates payload in a "frame" format. The encapsulation is specified at three levels:

  • File level - used to populate the file_encap field of a wth structure
  • Interface level - used to populate the wtap_encap field of a wtapng_if_descr_mandatory_t structure
  • Per Packet - used to populate the pkt_encap field of a wtap_pkthdr structure

There are 16 available user values (WTAP_ENCAP_USER0 to WTAP_ENCAP_USER15) and I use WTAP_ENCAP_USER11 for the TRBs.

Timestamp Precision

The timestamp precision can be specified at three levels:

  • File level - used to populate the file_tsprec field of a wth structure
  • Interface level - used to populate the tsprecision field of a wtapng_if_descr_mandatory_t structure
  • Per Packet - used to populate the pkt_tsprec field of a wtap_pkthdr structure

wtapng_block structure and substructures

In the tsdb_read_block(...) function above, I don't populate any of the fields in the wtapng_block structure. That's OK for an internal block, but for the TRBs, which we do wish to dissect, we need Wireshark to create a dissection chain (?).

What we need is for the content of the block to be treated as a new protocol, in the same way as the content of an EPB (Enhanced Packet Block) becomes a Frame.

I believe a block read function has to complete the following wtapng_block structure and substructure fields:

  • gboolean internal - indicates an internal block type; set in the TSDB and not in the TRB
  • nstime_t ts - the timestamp for the event you wish to be displayed in the packet list
  • guint32 caplen - the length of the payload in your block
  • guint32 len - the length of the original data - unless you are going to get very flash, this will be the same as the caplen
  • int pkt_encap - this should be set to a value from the WTAP_ENCAP_xxxx list of definitions in wiretap/wtap.h - WTAP_ENCAP_USER11 in the case of the TRB
  • int pkt_tsprec - this should be set to a value from the WTAP_TSPREC_xxxx list of definitions in wiretap/wtap.h - WTAP_TSPREC_USEC in the case of the TRB
  • presence_flags - these indicate the inclusion of certain field values in the pkt_hdr; values are:
    • WTAP_HAS_TS - time stamp
    • WTAP_HAS_CAP_LEN - captured length separate from on-the-network length
    • WTAP_HAS_INTERFACE_ID - interface ID
    • WTAP_HAS_COMMENTS - comments
    • WTAP_HAS_DROP_COUNT - drop count
    • WTAP_HAS_PACK_FLAGS - packet flags

The relevant code looks like this:

    /* Populate the wtapng_block */
    if (is_byte_swapped) {
        /* Header fields read from the file need swapping */
        wtapng_block->packet_header->ts.secs = GUINT32_SWAP_LE_BE(tr_hdr->timestamp_high);
        wtapng_block->packet_header->ts.nsecs = GUINT32_SWAP_LE_BE(tr_hdr->timestamp_low);
    }
    else {
        wtapng_block->packet_header->ts.secs = tr_hdr->timestamp_high;
        wtapng_block->packet_header->ts.nsecs = tr_hdr->timestamp_low;
    }
    wtapng_block->internal = FALSE;
    wtapng_block->packet_header->interface_id = 0;
    wtapng_block->packet_header->drop_count = -1; /* invalid */
    /* Both lengths come from block_data_len as passed in by the caller */
    wtapng_block->packet_header->caplen = block_data_len;
    wtapng_block->packet_header->len = block_data_len;
    wtapng_block->packet_header->pkt_encap = WTAP_ENCAP_USER11;
    wtapng_block->packet_header->presence_flags |= WTAP_HAS_TS;
    wtapng_block->packet_header->presence_flags |= WTAP_HAS_INTERFACE_ID;
    wtapng_block->packet_header->presence_flags |= WTAP_HAS_CAP_LEN;
    wtapng_block->packet_header->rec_type = REC_TYPE_PACKET;

See the complete code in Appendix B.

Related information

Appendix A - XML Example

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<source>
        <header headerline="false" skipheaderlines="0">
                <description>Descriptor file for Apache access log in common format</description>
                <generator>Babel 3.0</generator>
                <gendate>2017-10-20</gendate>
                <gentime>19:18:22</gentime>
                <genzoffset>+1</genzoffset>
                <owner>Paul Offord</owner>
                <nativeformat>LogFormat "%h %l %u %t \"%r\" %>s %b" common</nativeformat>
                <example>192.168.1.87 - user01 [09/Jul/2012:08:25:35 +0100] "GET /Setup.php HTTP/1.1" 200 1824</example>
                <wsnamespace>apache</wsnamespace>
                <charencoding>ASCII</charencoding>
        </header>
        <records>
                <record type="1">
                        <eols enforce="true">
                                <eol>\n</eol>
                                <eol>\r\n</eol>
                        </eols>
                        <delimiters>
                                <delimiter>&nbsp;</delimiter>
                        </delimiters>
                        <missingvalues>
                                <missingvalue>-</missingvalue>
                        </missingvalues>
                        <criteria>
                                <criterium type="string" offset="*">*</criterium>
                        </criteria>
                        <columns>
                                <column>
                                        <informat quoted="false">%i</informat>
                                        <name>host</name>
                                        <abbrev>bds.apache.host</abbrev>
                                        <blurb>This is the IP address of the client (remote host) which made the request to the server.</blurb>
                                        <type quoted="false">FT_IPvx</type>
                                        <display>BASE_NONE</display>
                                        <bitmask>0</bitmask>
                                </column>
                                <column>
                                        <informat quoted="false">%s</informat>
                                        <name>identid</name>
                                        <abbrev>bds.apache.identid</abbrev>
                                        <blurb>The identity of the client determined by a request to the identd server on the client's machine.</blurb>
                                        <type quoted="false">FT_STRINGZ</type>
                                        <display>BASE_NONE</display>
                                        <bitmask>0</bitmask>
                                </column>
                                <column>
                                        <informat quoted="false">%s</informat>
                                        <name>userid</name>
                                        <abbrev>bds.apache.userid</abbrev>
                                        <blurb>This is the userid of the person requesting the document as determined by HTTP authentication.</blurb>
                                        <type quoted="false">FT_STRINGZ</type>
                                        <display>BASE_NONE</display>
                                        <bitmask>0</bitmask>
                                </column>
                                <column>
                                        <informat quoted="false" start-bracket="[" end-bracket="]">[%d/%b/%Y:%H:%M:%S %z]</informat>
                                        <name>datetime</name>
                                        <abbrev>bds.apache.datetime</abbrev>
                                        <blurb>The time that the request was received.</blurb>
                                        <type>EVENT_DATETIME</type>
                                        <display>BASE_NONE</display>
                                        <bitmask>0</bitmask>
                                </column>
                                <column>
                                        <informat quoted="true">%s</informat>
                                        <name>request</name>
                                        <abbrev>bds.apache.request</abbrev>
                                        <blurb>The request line from the client is given in double quotes.</blurb>
                                        <type>FT_STRINGZ</type>
                                        <display>BASE_NONE</display>
                                        <bitmask>0</bitmask>
                                </column>
                                <column>
                                        <informat quoted="false">%d</informat>
                                        <name>response code</name>
                                        <abbrev>bds.apache.response-code</abbrev>
                                        <blurb>This is the status code that the server sends back to the client.</blurb>
                                        <type>FT_UINT32</type>
                                        <display>BASE_DEC</display>
                                        <bitmask>0</bitmask>
                                </column>
                                <column>
                                        <informat quoted="false">%d</informat>
                                        <name>bytes returned</name>
                                        <abbrev>bds.apache.sc-bytes</abbrev>
                                        <blurb>This indicates the size of the object returned to the client, not including the response headers.</blurb>
                                        <type>FT_UINT32</type>
                                        <display>BASE_DEC</display>
                                        <bitmask>0</bitmask>
                                </column>
                        </columns>
                        <infofield>%4 - %5</infofield>
                </record>
        </records>
</source>

Appendix B - Sample Code

Note that this is still a very early cut - it's quite messy. I also need to update the code to handle a total field descriptors length field for the TSDB. This is because I need to add Options fields to the TSDB and these will come after the Field Descriptors.

The TRB format doesn't match that shown in Appendix D either. I will update the code as soon as possible.

trb_20180209.zip

Appendix C - Sample PCAP-NG File

This sample is a converted Apache HTTPD access log.

access_log.pcapng

Appendix D - Block Formats

The following are the target formats for the blocks. The sample file attached here doesn't quite match these formats. I will update the code soon.

TSDB

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------------------------------------------------------+
 0 |                    Block Type = 0x80000010                    |
   +---------------------------------------------------------------+
 4 |                      Block Total Length                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 8 |            Version            |            Format             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
12 |          Scheme Index         |           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
16 |                             GUID1                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
20 |                             GUID2                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
24 |                             GUID3                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
28 |                             GUID4                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
32 |                           FD Length                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
36 /                                                               /
   /                       Field Descriptors                       /
   /              variable length, padded to 32 bits               /
   /                                                               /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                                                               /
   /                      Options (variable)                       /
   /                                                               /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Block Total Length                       |
   +---------------------------------------------------------------+

TRB

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------------------------------------------------------+
 0 |                    Block Type = 0x80000011                    |
   +---------------------------------------------------------------+
 4 |                      Block Total Length                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 8 |            Version            |            Format             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
12 |          Scheme Index         |           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
16 |                        Timestamp (High)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
20 |                        Timestamp (Low)                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
24 /                                                               /
   /                           Text Data                           /
   /              variable length, padded to 32 bits               /
   /                                                               /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Block Total Length                       |
   +---------------------------------------------------------------+
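
As a cross-check of the TRB layout above, the following sketch pulls the fixed fields out of a raw block and computes the padded block length. It is not taken from the sample code: the struct and function names are mine, and it assumes host byte order (no swapping), with offsets read straight from the diagram.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TRB_FIXED_LEN   24u  /* bytes before Text Data, per the diagram */
#define TRB_TRAILER_LEN  4u  /* trailing Block Total Length */

struct trb_header {
    uint32_t block_type;
    uint32_t total_length;
    uint16_t version;
    uint16_t format;
    uint16_t scheme_index;
    uint16_t reserved;
    uint32_t timestamp_high;
    uint32_t timestamp_low;
};

/* Round a length up to the next multiple of 4 bytes (32 bits). */
uint32_t pad32(uint32_t len)
{
    return (len + 3u) & ~3u;
}

/* Block Total Length for a TRB carrying text_len bytes of text. */
uint32_t trb_total_length(uint32_t text_len)
{
    return TRB_FIXED_LEN + pad32(text_len) + TRB_TRAILER_LEN;
}

/* Copy the fixed fields out of buf; returns 0 on success, -1 if too short. */
int trb_parse_header(const uint8_t *buf, size_t avail, struct trb_header *hdr)
{
    assert(buf != NULL && hdr != NULL);
    if (avail < TRB_FIXED_LEN + TRB_TRAILER_LEN)
        return -1;
    memcpy(&hdr->block_type,     buf + 0,  4);
    memcpy(&hdr->total_length,   buf + 4,  4);
    memcpy(&hdr->version,        buf + 8,  2);
    memcpy(&hdr->format,         buf + 10, 2);
    memcpy(&hdr->scheme_index,   buf + 12, 2);
    memcpy(&hdr->reserved,       buf + 14, 2);
    memcpy(&hdr->timestamp_high, buf + 16, 4);
    memcpy(&hdr->timestamp_low,  buf + 20, 4);
    return 0;
}
```

Copying field by field at explicit offsets avoids relying on compiler struct packing, which matters if the on-disk layout changes while the formats are still being settled.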

Appendix E - Request for PCAP-NG Block Types

-----Original Message-----
From: pcap-ng-format [mailto:pcap-ng-format-bounces@winpcap.org] On Behalf Of pcap-ng-format-owner@winpcap.org
Sent: 22 January 2018 08:24
To: Paul Offord <Paul.Offord@advance7.com>
Subject: Your message to pcap-ng-format awaits moderator approval

Your mail to 'pcap-ng-format' with the subject

    Request for Two PCAP-NG Block Types

Is being held until the list moderator can review it for approval.

The reason it is being held:

    Post by non-member to a members-only list

Either the message will get posted to the list, or you will receive notification of the moderator's decision.  If you would like to cancel this posting, please visit the following URL:

    https://www.winpcap.org/mailman/confirm/pcap-ng-format/4605e8ba2f16580753f5279664949172c66a78bd

Adding Support for a New Block Type (last edited 2018-02-26 08:45:43 by PaulOfford)