Import data
How to write an importer for data formats not yet supported by the toolbox?
Introduction
The trEPR Toolbox is modular by design. That means that it is fairly easy to add your own importer for whatever data format you intend to read. Of course, it is your responsibility to decide whether the toolbox fits the type of data you want to import reasonably well. Beyond that, an importer only has to follow a few relatively simple requirements (as the importer is the one interface between the real world and the self-contained and idyllic world of the toolbox), so writing one should be pretty straightforward.
Please note: Although far from complete, the following might help developers who want to write importers for data formats that are not yet supported by the toolbox.
A few facts about the toolbox that are relevant for writing an importer:
- The toolbox uses a clearly defined structure (Matlab™ struct) to save the data internally.
- All importers should output such a structure.
- The wrapper functions should be placed in the “IO” directory of the toolbox.
- The “wrapper” for all importer functions is trEPRload.
- Add a new entry in the file trEPRload.ini to add your new importer to the list of supported formats. That makes it appear inside the GUI as well (as the GUI load panel automatically reads the supported formats from the trEPRload.ini file). How to add such an entry is explained in that ini file.
There are a few tools that help you to get used to the data structure. See the next section for that.
<note important>Please note: The trEPR Toolbox data structure was designed with trEPR data in mind (both with and without magnetic fields applied). Therefore, it might not be ideal for other types of data. But most certainly there will be ways to extend the data structure if necessary/useful. In such a case, please let me know in advance, so that we can agree on the changes to make and keep the structure as consistent as possible.</note>
Tools
There are a few tools that might help to get used to the data structure, to write importer routines, and even to validate their output.
trEPRdataStructure
The function trEPRdataStructure helps you with getting to know the trEPR Toolbox data structure.
To get to know how to use this function:
help trEPRdataStructure
trEPR Toolbox data structure
To get a Matlab™ struct with all the fields of the trEPR Toolbox data structure, type:
dataStructure = trEPRdataStructure('structure')
If you want something similar, i.e. a Matlab™ struct with all the fields, but with each field containing a string that tells you which type the actual field has (unless the field is a struct itself), use the following syntax:
dataModel = trEPRdataStructure('model')
Validate your own data structure
Suppose that you have already written an importer (here named myImporter) that returns some struct with data. To test whether your struct complies with the trEPR Toolbox data model, use the following syntax:
% Read your data
data = myImporter(...);
% Validate your data structure for complying with trEPR Toolbox data structure
[missingFields,wrongType] = trEPRdataStructure('check',data)
Now, missingFields should contain the fields missing in your structure, and wrongType should list each field in your structure that appears to have the wrong data type.
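The two return values can be turned into a short report. The following is only a sketch: the helper name reportValidation is hypothetical, and it assumes that both missingFields and wrongType are cell arrays of strings, which you should confirm with help trEPRdataStructure.

```matlab
function ok = reportValidation(missingFields,wrongType)
% REPORTVALIDATION Hypothetical helper, not part of the toolbox.
% Summarises the output of trEPRdataStructure('check',data).
% Assumption: both inputs are cell arrays of strings.
%
% Usage (requires the toolbox and your importer on the Matlab path):
%   [missingFields,wrongType] = trEPRdataStructure('check',data);
%   ok = reportValidation(missingFields,wrongType);

ok = isempty(missingFields) && isempty(wrongType);
if ok
    disp('Structure complies with the data model.');
    return;
end
if ~isempty(missingFields)
    % fprintf repeats the format string for every element of the cell array
    fprintf('Missing field: %s\n',missingFields{:});
end
if ~isempty(wrongType)
    fprintf('Wrong type: %s\n',wrongType{:});
end
end
```

Returning a logical makes it easy to call the check from a test script and stop as soon as an importer produces a non-compliant structure.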
Tips
A few hopefully helpful ideas for writing importers.
Enhancing compatibility with further toolbox development
A tip for maximum compatibility of your importer with the trEPR Toolbox data structure: Inside your importer, get the trEPR Toolbox data structure by calling the trEPRdataStructure function once, and selectively fill this structure with your data.
That helps you to stay as compatible as possible with the toolbox data structure, as this may change slightly over time.
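One possible way to fill the structure selectively is to copy only those values whose field names exist in the template. The sketch below is an illustration, not the toolbox's own method: fillTemplate is a hypothetical helper, and the field names in the usage comment are placeholders rather than actual fields of the data structure.

```matlab
function data = fillTemplate(template,values)
% FILLTEMPLATE Hypothetical helper, not part of the toolbox.
% Copies every field of "values" that also exists in "template" and
% leaves all other template fields at their default values.
%
% Usage inside an importer (field names are placeholders):
%   template = trEPRdataStructure('structure');
%   data = fillTemplate(template,struct('someField',rawValues));

data = template;
names = fieldnames(values);
for k = 1:numel(names)
    name = names{k};
    if ~isfield(template,name)
        % Not part of the template: skip it instead of adding a new field
        continue;
    end
    if isstruct(template.(name)) && isstruct(values.(name))
        % Descend into nested structures
        data.(name) = fillTemplate(template.(name),values.(name));
    else
        data.(name) = values.(name);
    end
end
end
```

Because the result always starts from the template, fields added to the data structure in later toolbox versions appear in your output automatically with their default values.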
Meta data
If you are interested in more detail about all the information that gets stored in the data structure of the toolbox and where all the fields may come from, have a look at the info file structure.
The idea behind such “info files” is to mimic a labbook record. Admittedly, the author of the trEPR Toolbox is rather lazy in this respect, and therefore, to make life easier, the info files provide you with a scaffold of all the necessary information that you should collect during the experiment, especially when working with a lab-built setup that gets changed quite often over time.1)
Template for a new importer
To make life easier, the following is a list of important aspects and design principles of the toolbox's importer routines, and below that, a skeleton of a Matlab™ function for a new importer.
Please note: This is not to restrict your way of coding, it is just to make life easier to start with, and it may lead to a more consistent code of the toolbox.
Of course, the TIMTOWTDI2) principle still applies.
Design principles and tips
A few rather important things to note:
- Both input and output parameters are fixed.
  - Input: fileName, varargin
  - Output: data, warnings
- varargin represents a list of optional parameters, two of which are particularly important:3)
  - combine (logical): Whether to combine files.
  - checkFormat (logical): Whether to check for the correct format of the file.
- If something goes wrong while reading the file, the importer should do the following:
  - Return an empty numerical value [] in data
  - Return a string/cell array in warnings explaining what went wrong (e.g., wrong format, file does not exist, no (valid) filename given).
- If everything went well, data should be according to the trEPR Toolbox data structure and warnings empty.
- If more than one dataset has been loaded (if this is supported by the importer), data is a cell array of structures that validate against the trEPR Toolbox data structure.
- Generally, all importers should be able to handle cell arrays, structures and strings as file names.
  - A code listing of how to handle this can be found below.
- It is good practice to use a subfunction “loadFunction” for the actual loading. Have a look at the source code of trEPROXload.m for an example.
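The error-handling contract from the list above (empty data plus an explanatory warning) can be sketched for a single file as follows. This is a minimal illustration under stated assumptions: loadFileSafely is a hypothetical name, the file is assumed to contain plain binary doubles, and the one-field struct is only a stand-in for the real toolbox data structure.

```matlab
function [data,warnings] = loadFileSafely(fileName)
% LOADFILESAFELY Hypothetical sketch of a loader that never throws.
% On failure: data = [] and warnings explains what went wrong.
% On success: data is a struct and warnings is empty.

data = [];
warnings = cell(0);

if ~ischar(fileName) || isempty(fileName)
    warnings{end+1} = 'No (valid) filename given.';
    return;
end
% exist(...,'file') returns 2 for files and 7 for directories
if exist(fileName,'file') ~= 2
    warnings{end+1} = sprintf('File "%s" does not exist.',fileName);
    return;
end

fid = fopen(fileName,'r');
if fid == -1
    warnings{end+1} = sprintf('Cannot open file "%s".',fileName);
    return;
end
try
    % Assumption for illustration: the file holds raw binary doubles
    raw = fread(fid,inf,'double');
catch exception
    fclose(fid);
    warnings{end+1} = sprintf('Cannot read "%s": %s',fileName,exception.message);
    return;
end
fclose(fid);

% Stand-in for filling the structure from trEPRdataStructure('structure')
data = struct('data',raw);
end
```

Every failure path returns early with data still set to [], so the caller (and the GUI) only has to check isempty(data) and display warnings.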
Coding example
The following is a skeleton of a Matlab™ function for a new importer. To get an idea of what an actual importer routine can look like, have a look at trEPROXread.m for example.
Please don't be scared by what looks like a tremendous overhead. Most parts are documented well enough that it should be obvious what they are good for. And please keep in mind: As the importer gets used in combination with the GUI, a certain level of fail-safe behaviour is required.
- "trEPRxyFormatRead.m"
function [data,warnings] = trEPRxyFormatRead(fileName,varargin)
% trEPRXYFORMATREAD Read xy format files (binary)
%
% Usage
%   data = trEPRxyFormatRead(fileName)
%   [data,warnings] = trEPRxyFormatRead(fileName)
%   data = trEPRxyFormatRead(fileName,...)
%
%   ... add description of parameters here...
%
% See also: trEPRload, trEPRdataStructure

% (c) 20xx, <Developer's name>
% 20xx-xx-xx

% Parse input arguments using the inputParser functionality
p = inputParser;            % Create an instance of the inputParser class.
p.FunctionName = mfilename; % Function name to be included in error messages
p.KeepUnmatched = true;     % Keep unmatched arguments instead of raising an error
p.StructExpand = true;      % Enable passing arguments in a structure
p.addRequired('fileName', @(x)ischar(x) || iscell(x) || isstruct(x));
% p.addOptional('parameters','',@isstruct);
p.addParamValue('combine',logical(false),@islogical);
p.addParamValue('sortfiles',logical(true),@islogical);
% Note, this is to be compatible with trEPRload - currently without function!
p.addParamValue('checkFormat',logical(true),@islogical);
p.parse(fileName,varargin{:});

% Assign optional arguments from parser
combine = p.Results.combine;
sortfiles = p.Results.sortfiles;

warnings = cell(0);

% If no filename given
if isempty(fileName)
    data = [];
    warnings{end+1} = 'No filename.';
    return;
end

% Handling different data types of fileName parameter
if iscell(fileName)
    if sortfiles
        % sort returns the sorted array, so assign the result
        fileName = sort(fileName);
    end
elseif isstruct(fileName)
    % That might be the case if the user uses "dir" as input for the
    % filenames, as this returns a structure with fields as "name"
    if ~isfield(fileName,'name')
        data = [];
        warnings{end+1} = 'Cannot determine filename(s).';
        return;
    end
    % Convert struct to cell
    fileName = struct2cell(fileName);
    fileName = fileName(1,:)';
    % Remove files with leading '.', such as '.' and '..'
    fileName(strncmp('.',fileName,1)) = [];
    if sortfiles
        fileName = sort(fileName);
    end
else
    % If filename is neither cell nor struct
    % Given the input parsing it therefore has to be a string
    if exist(fileName,'dir')
        % Read directory
        fileName = dir(fileName);
        % Convert struct to cell
        fileName = struct2cell(fileName);
        fileName = fileName(1,:)';
        % Remove files with leading '.', such as '.' and '..'
        fileName(strncmp('.',fileName,1)) = [];
        if sortfiles
            fileName = sort(fileName);
        end
    elseif exist(fileName,'file')
        % For convenience, convert into cell array
        fn = fileName;
        fileName = cell(0);
        fileName{1} = fn;
    else
        % If "filename" is neither a directory nor a file...
        % Check whether it's only a basename
        fileName = dir([fileName '*']);
        if isempty(fileName)
            data = [];
            warnings{end+1} = 'No valid filename.';
            return;
        end
        % Convert struct to cell
        fileName = struct2cell(fileName);
        fileName = fileName(1,:)';
        % Remove files with leading '.', such as '.' and '..'
        fileName(strncmp('.',fileName,1)) = [];
        if sortfiles
            fileName = sort(fileName);
        end
    end
end

% Add your code here

function [data,warnings] = loadFile(fileName)
% LOADFILE Load file and return contents.
%
% fileName - string
%            Name of a file (normally including full path)
%
% data     - structure
%            According to the toolbox data structure
%
% warnings - cell array of strings
%            Contains warnings if there are any, otherwise empty.

% A few important settings
% Name of the format as it appears in the file.format field
formatNameString = '<your format specifier string - may contain spaces>';

% Add code for actually importing your data here
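The skeleton leaves open how the normalised cell array of file names is passed on to the loading subfunction. One way to do this is sketched below. The helper name loadAllFiles and the idea of passing the loader as a function handle are illustrative assumptions, not toolbox conventions; inside a real importer you would more likely loop directly and call loadFile.

```matlab
function [data,warnings] = loadAllFiles(fileNames,loader)
% LOADALLFILES Hypothetical helper: call a loader for each file name.
%
% fileNames - cell array of strings
% loader    - function handle, [data,warnings] = loader(fileName)

data = cell(1,0);
warnings = cell(1,0);

for k = 1:numel(fileNames)
    [dataset,message] = loader(fileNames{k});
    if ~isempty(dataset)
        data{end+1} = dataset; %#ok<AGROW>
    end
    % Accept both a single string and a cell array of strings
    if ischar(message) && ~isempty(message)
        warnings{end+1} = message; %#ok<AGROW>
    elseif iscell(message)
        warnings = [warnings reshape(message,1,[])]; %#ok<AGROW>
    end
end

% Empty value if nothing was loaded, a single structure for one
% dataset, a cell array of structures otherwise.
if isempty(data)
    data = [];
elseif numel(data) == 1
    data = data{1};
end
end
```

Collecting the warnings of all files in one cell array keeps the output format identical whether one or many files were requested.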
Testing your new importer
Before you add your new file format to the toolbox (see below), please test your importer thoroughly, e.g. by validating the output it creates.
See above for how to validate your output using trEPRdataStructure.
Other things you may (and should) test for:
- Test with fileName being a string, a structure (such as returned when using dir), and a cell array.
- Test what happens if you try to load files with a different format than what you actually wrote your importer for.
- Use invalid file names.
In every case, the routine should “exit gracefully”, meaning that if something goes wrong, it should still return, with an empty vector as data and a string/cell array in warnings that tells the user what may have gone wrong.
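Such checks can be collected in a small script. The following is a sketch only: checkImporterContract is a hypothetical helper, not an existing toolbox routine, and it tests nothing but the “exit gracefully” behaviour described above (no error thrown, and empty data always accompanied by a warning).

```matlab
function report = checkImporterContract(importer,inputs)
% CHECKIMPORTERCONTRACT Hypothetical helper, not part of the toolbox.
%
% importer - function handle, [data,warnings] = importer(fileName)
% inputs   - cell array of fileName arguments to try
% report   - cell array of strings, one entry per input
%
% Usage (myImporter and the file names are placeholders):
%   report = checkImporterContract(@myImporter, ...
%       {'', 'doesNotExist.xy', {'a.xy','b.xy'}, dir('*.xy')})

report = cell(numel(inputs),1);
for k = 1:numel(inputs)
    try
        [data,warnings] = importer(inputs{k});
        if isempty(data) && isempty(warnings)
            report{k} = 'empty data without warning';
        else
            report{k} = 'ok';
        end
    catch exception
        % An importer should never let an error escape to the caller
        report{k} = ['error thrown: ' exception.message];
    end
end
end
```

Any entry other than 'ok' points to an input for which the importer does not yet behave as required when called from the GUI.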
Given enough time (and the need for it), there might even be a test routine for new importers at some point in the future4). Until then, please help yourself by following the tips laid out above.
Adding your importer to trEPRload.ini
Once you've written your importer routine and thoroughly tested that its output complies with the trEPR Toolbox data structure, you may add it to the trEPRload.ini file to make it accessible from within the GUI.
Following is an excerpt of the trEPRload.ini file describing an entry for a supported file format:
% Configuration file for the trEPRload function of the trEPR toolbox
%
% (c) 2011-12, Till Biskup <till@till-biskup.de>
%
% Each file format that is recognized by the trEPRload function
% has its own entry in this file. The format of this entry is as follows:
%
% [<file format>]
% name = short name of the format (used to identify it)
% description = more detailed description
% type = <ascii|binary>
% identifierString = <string that can be used to identify the file>
% fileExtension = file extension(s) (if a list, separate by "|")
% function = <function that is used to handle the file>
% multipleFiles = <true|false> whether format consists of multiple files
% parameters = <additional parameters passed to the function>
% combineMultiple = <true|false> whether routine can combine multiple files
If you are still in doubt what several of these fields may be used for, have a look at the complete trEPRload.ini file as such, or, if that doesn't help, ask the toolbox author.
Please note: The “<file format>” identifier has to be unique and a single word.
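For the skeleton importer trEPRxyFormatRead shown above, an entry might look roughly like the following. All values are made up for illustration (the format identifier, the identifier string and the file extension are hypothetical), and whether a field such as parameters may be left empty is an assumption that is best checked against the existing entries in trEPRload.ini.

```ini
% Hypothetical entry, all values are placeholders
[xyFormat]
name = xy format
description = Binary xy format written by a lab-built spectrometer
type = binary
identifierString = XYFORMAT
fileExtension = xy
function = trEPRxyFormatRead
multipleFiles = false
parameters =
combineMultiple = false
```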
Every file format defined in the trEPRload.ini file gets automatically recognised by the GUI. That means that you can select it in the Load panel. The string that appears in the popup menu in that panel is determined by the field “name” in the trEPRload.ini file.
trEPRload
routine, at least if called from inside the GUI.