Starting in the Q2/15 release, FlowTraq allows an optional third stage of session storage, called the archive. In a typical configuration, the archive is the largest but slowest of the three storage stages (the other two being the original RAM cache and the Session Database). It may reside on network storage or, if the SESSIONDB is backed by SSD, on cheaper, slower spinning disk. Archives are a purely optional extension to the FlowTraq database; adding them during setup or as part of the upgrade process is recommended, but they can be added at any later date.
When planning an archive for a FlowTraq install, there are several considerations to keep in mind:
Archives are potentially much slower than regular storage, even when located on identical media, because archived data is stored and retrieved differently. Typically, the archive should cover a span of history for which data may be needed (such as for compliance purposes) but will not be accessed frequently.
Whenever queries go to archive (just as when they go to disk in regular storage), all workers will access their archives at the same time. Network storage may be used for archive, but the effect on bandwidth of this parallel-access behavior should be taken into account.
FlowTraq archives are compatible with a number of internal and external storage options and file systems, but are particularly well-suited to compressed file systems such as ZFS.
When deploying in a cluster, the same sizing and consistency considerations apply to archives across cluster members as to regular storage. The archive can only be considered as historically accurate as its smallest component: if two workers each have five weeks of archive history but a third worker has only one day, then the archive is only forensically accurate to one day.
Unlike with regular storage, FlowTraq can tolerate the media underlying the archive being periodically unavailable; in that condition, sessions are simply not written.
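The cluster and parallel-access considerations above reduce to simple arithmetic, sketched here in Python. This is only an illustration: the function names, the worker names, and the 80 MB/s per-worker read rate are assumptions made for the example, not FlowTraq settings or measurements.

```python
from datetime import timedelta


def effective_archive_span(worker_spans):
    """Return the history span for which every worker has archived data.

    The cluster's archive is only as complete as its shortest member,
    so the effective span is the minimum across all workers.
    """
    if not worker_spans:
        raise ValueError("at least one worker span is required")
    return min(worker_spans.values())


def peak_archive_read_rate(worker_count, per_worker_mb_per_s):
    """Estimate aggregate read demand when all workers hit archive at once."""
    return worker_count * per_worker_mb_per_s


# Hypothetical three-worker cluster matching the example in the text.
spans = {
    "worker-a": timedelta(weeks=5),
    "worker-b": timedelta(weeks=5),
    "worker-c": timedelta(days=1),
}

print(effective_archive_span(spans))    # 1 day, 0:00:00
print(peak_archive_read_rate(3, 80.0))  # 240.0 (MB/s, assumed per-worker rate)
```

If the archives sit on shared network storage, the second figure is the load the network path would need to absorb during an archive query.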
FlowTraq support is available to help think through the deployment of these systems. Contact support@flowtraq.com or your sales representative.
Once the hardware has been determined and attached, mount the drive on the host running FlowTraq. The archive storage format is the same as the original session storage format, and the archive can (and should) be "primed" with a copy of the original SESSIONDB, including all .dat files.
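The priming step can be sketched as a plain directory copy. The helper below is a hypothetical illustration, not a FlowTraq tool: the function name is invented, the caller supplies both paths, and whether FlowTraq must be stopped during the copy is not stated here and should be confirmed with support.

```python
import shutil
from pathlib import Path


def prime_archive(sessiondb_dir, archive_dir):
    """Copy an existing SESSIONDB into the archive directory.

    Returns the list of .dat files (relative paths) that are missing
    from the archive after the copy; an empty list means all arrived.
    """
    src = Path(sessiondb_dir)
    dst = Path(archive_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"SESSIONDB directory not found: {src}")

    # Copy the whole tree; dirs_exist_ok (Python 3.8+) allows the
    # archive directory to exist already, e.g. as a fresh mount point.
    shutil.copytree(src, dst, dirs_exist_ok=True)

    # Confirm that every .dat file in the source is present in the copy.
    missing = []
    for dat_file in src.rglob("*.dat"):
        relative = dat_file.relative_to(src)
        if not (dst / relative).is_file():
            missing.append(relative)
    return missing
```

A call such as `prime_archive("/path/to/SESSIONDB", "/opt/flowtraqarchive/ARCHIVE")` would then be expected to return an empty list; the source path in that call is a placeholder.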
Once the space is prepared, open the flowtraq.conf file in the FlowTraq install directory for editing, and enter a block named 'archive' with two values:
- databasepath
The full or relative path to the directory in which FlowTraq should place the files. If the highest-level directory does not exist (e.g. 'ARCHIVE' in '/opt/flowtraqarchive/ARCHIVE'), FlowTraq will attempt to create it.
- autoconfigure
The maximum desired size of the archive, typically listed in GB.
When finished, the block should look similar to the following example:
Once the archive settings are correct, save the configuration file and restart the FlowTraq server. At this point, FlowTraq will save its current data and create the initial archive files (if none were provided by copying SESSIONDB).
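Because FlowTraq only attempts to create the last directory in the archive path, it can be worth checking the path before the restart. The following is a minimal sketch with a hypothetical helper name; it assumes the check runs on the FlowTraq host as the user that runs the server, and that a relative path is resolved from the same working directory FlowTraq uses.

```python
import os
import shutil
from pathlib import Path


def check_archive_path(databasepath, desired_gb):
    """Return a list of problems found with the planned archive location."""
    problems = []
    target = Path(databasepath)

    # If the final directory is absent, its parent must already exist,
    # since only the last path component is created automatically.
    base = target if target.is_dir() else target.parent
    if not base.is_dir():
        problems.append(f"parent directory does not exist: {base}")
        return problems

    if not os.access(base, os.W_OK):
        problems.append(f"directory is not writable: {base}")

    # Compare free space with the desired archive size (decimal GB).
    free_gb = shutil.disk_usage(base).free / 1e9
    if free_gb < desired_gb:
        problems.append(
            f"only {free_gb:.1f} GB free, but {desired_gb} GB requested"
        )
    return problems
```

An empty list suggests the location is usable; any returned message points at something to fix before restarting the server.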
After creating each archive in a cluster, and again when the entire process is complete, verify in the FlowTraq cluster configuration page that the archive information appears in the configuration box for the relevant worker. Information on total capacity can be found in the portal node in the line below that for regular storage.