File Formats of Micro Focus COBOL
      
      This COBOL system provides three types of data file organization:
      relative, indexed and sequential. Additionally, sequential files fall
      into one of three categories: record sequential, printer sequential
      and line sequential.
      
      Record sequential, relative and indexed files can contain records
      that are either all of fixed length or of variable length. These
      files have fixed or variable format respectively. Printer sequential
      and line sequential files contain records that are implicitly
      variable length and have separate file formats.
      
      Record sequential, relative and indexed files allow two different
      formats: fixed and variable. The file format is specified explicitly
      or implicitly as described in the section Fixed and Variable Format
      later in this document. Fixed and variable format map indirectly onto
      two types of file structure: fixed and variable.
      
      The physical structure of each of these file types as they exist on
      disk is explained in this document.
      
      This information is provided for anyone who wants to understand the
      nature of data files produced by programs created using this COBOL
      system, or to process them outside the COBOL system where appropriate.
      It can also be useful for debugging programs. However, you do not
      need to understand these file structures to use data files from COBOL
      programs.
      
      You are advised not to process the files yourself using byte-stream
      I/O, but to use COBOL syntax or the file handler call interface (documented
      in an add-on product). This ensures that applications will function
      properly if file formats are enhanced or developed in the future.
      
 Fixed and Variable Format
      
      The format of a record sequential, relative or indexed file can
      be set to fixed or variable, either explicitly or implicitly.
      
      The file format is always fixed unless one of the following conditions
      is specified for the file or file record:
      
       - The RECORDING MODE IS V clause, which always creates variable
         format.
      
       - The RECORD IS VARYING clause, which creates variable format
         provided no RECORDING MODE IS F clause is present.
      
       - The OCCURS...DEPENDING ON clause, which creates variable format
         when you set the NOODOSLIDE Compiler directive.
      
       - The RECMODE"V" Compiler directive, which creates variable format
         for each file where no RECORDING MODE IS F or RECORD CONTAINS n
         CHARACTERS clauses are present.
      
       - The RECMODE"OSVS" Compiler directive, which creates variable
         format for files that contain fixed record definitions of different
         lengths.
      
       - The data compression feature, which creates variable format.
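      
      The conditions above can be sketched as a small decision function.
      This is only an illustration: the FileDeclaration class and its field
      names are hypothetical, and because the text does not say how the
      conditions interact, the sketch simply treats each one independently.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FileDeclaration:
    """Hypothetical summary of the clauses and directives affecting one file."""
    recording_mode: Optional[str] = None   # "F", "V", or None if no RECORDING MODE clause
    record_is_varying: bool = False        # RECORD IS VARYING clause present
    record_contains_n: bool = False        # RECORD CONTAINS n CHARACTERS clause present
    has_odo: bool = False                  # OCCURS...DEPENDING ON clause present
    noodoslide: bool = False               # NOODOSLIDE directive set
    recmode: Optional[str] = None          # "V", "OSVS", or None
    record_lengths: Tuple[int, ...] = ()   # lengths of the fixed record definitions
    compressed: bool = False               # data compression in use


def is_variable_format(d: FileDeclaration) -> bool:
    """Return True if any listed condition selects variable format."""
    if d.recording_mode == "V":
        return True
    if d.record_is_varying and d.recording_mode != "F":
        return True
    if d.has_odo and d.noodoslide:
        return True
    if d.recmode == "V" and d.recording_mode != "F" and not d.record_contains_n:
        return True
    if d.recmode == "OSVS" and len(set(d.record_lengths)) > 1:
        return True
    if d.compressed:
        return True
    return False  # none of the conditions apply, so the format is fixed
```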
      
 Basic File Structures
      
      There are four basic structures used for all files: fixed, variable,
      line sequential and printer sequential.
      
      Fixed is the structure used by fixed format record sequential and
      relative files, and contains only fixed length records. The size of
      each record is equal to the length of the largest record definition
      for the file.
      
      Variable is the structure of variable format record sequential and
      relative files, and fixed and variable format indexed files. Variable
      structure files can contain fixed or variable length records.
      
      Line sequential is the structure of files with the line sequential
      organization. Line sequential files are designed to enable you to
      read source or text files created with the system editor. As such,
      the format is operating system dependent but typically contains variable
      length records with trailing spaces removed. See the section Line
      Sequential Organization later in this document for further details.
      
      Printer sequential is the structure of files that are destined for a printer,
      either directly or by later spooling of a disk file. They contain
      vertical and horizontal tab controls. The structure of these files
      reflects what is required to drive a printer and so is independent
      of the operating system. See the section Printer Sequential Files
      later in this document for further details.
      
      The following sections describe these file structures.
      
 Fixed Structure
      
      Fixed structure files contain no record or file header information.
      The records are all the same length, that length being determined by
      the longest record defined in the File Description (FD) in the program's
      File Section.
      
 Variable Structure
      
      Any files containing variable length records, with the exception of
      line sequential files and files destined for the printer, contain
      a block of 128 bytes of header information at the start of the file.
      Each record in the file is preceded by a 2- or 4-byte control field.
      The top 4 bits of this field indicate the status of the record. A
      value of 0100 in these bits means that this record is a user data
      record. Any other value means that this record has either been deleted
      or is used internally. The remainder of the control field contains
      the length of the record.
      
      For all files where the maximum record size is less than 4096 bytes
      (excluding the prefix), the prefix is 2 bytes long. For all other
      files, the prefix is 4 bytes long. Each record always starts on the
      next 4-byte boundary in the file.
      
      You must not alter the header information or the control fields in
      any way since these are maintained by this COBOL system.
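      
      For read-only inspection, the control field can be decoded roughly
      as follows. This is a sketch, not the file handler's own code. The
      function name is hypothetical, and the byte order (most significant
      byte first) is an assumption inferred from the x"30 7E" example given
      for the File Header record.

```python
def decode_record_header(buf: bytes, offset: int, header_size: int):
    """Return (record_type, record_length) for the control field at offset.

    header_size is 2 or 4, depending on the maximum record size of the file.
    """
    if header_size not in (2, 4):
        raise ValueError("control field is 2 or 4 bytes long")
    field = int.from_bytes(buf[offset:offset + header_size], "big")
    length_bits = header_size * 8 - 4          # everything below the top 4 bits
    record_type = field >> length_bits         # top 4 bits: status of the record
    record_length = field & ((1 << length_bits) - 1)
    return record_type, record_length


# A 2-byte control field x"40 05": user data record (0100) of length 5.
print(decode_record_header(bytes.fromhex("4005"), 0, 2))
```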
      
 Record Header Types
      

      First 4 bits   Record type
__________________________________________________________________________
      1 (0001)       A system record (IDXFORMAT"4" files only).
                     This contains duplicate occurrence details in the data file.

      2 (0010)       Deleted record (available for reuse via the Free Space list).

      3 (0011)       System record.

      4 (0100)       User data record.

      5 (0101)       Reduced user data record (indexed files only).
                     The 16-bit word immediately following the data record, as
                     indicated by the length in the header, contains the space between
                     the end of the data record and the start of the next record header.

      6 (0110)       Pointer record (indexed files only).
                     The first 4 bytes following the record header contain the
                     offset in the file to the location of the user data record.

      7 (0111)       User data record referenced by a Pointer record.

      8 (1000)       Reduced user data record referenced by a Pointer record.
      

      The first record in every variable structure file is a system record
      called the File Header record. This is normally 128 bytes long.
      
      The record header for each record starts on a 4-byte boundary. Consequently,
      a record may be followed by up to three padding characters, usually
      spaces. These padding characters are not included in the record length.
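      
      The alignment rule amounts to rounding the end of each record up to
      the next multiple of four. A minimal sketch, using a hypothetical
      helper name and ignoring the special spacing of reduced user data
      records:

```python
def next_header_offset(start: int, header_size: int, record_length: int) -> int:
    """Offset of the record header that follows the record starting at start.

    start is the offset of the current record header, which is assumed to
    lie on a 4-byte boundary already.
    """
    end = start + header_size + record_length  # first byte after the data
    return (end + 3) & ~3                      # round up to a 4-byte boundary


# A 5-byte record with a 2-byte header at offset 128 ends at 135,
# so one padding character follows and the next header is at 136.
print(next_header_offset(128, 2, 5))
```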
      

                Variable structure File Header record description:

       Offset Size   Description of the field
_________________________________________________________________________________
       0       4     Length of the file header.
                     The first 4 bits are always set to 3 (0011 in binary) indicating
                     that this file header record is a system record.
      
                     The remaining bits contain the length of the file header
                     record. If the maximum record length is less than 4095 bytes, the
                     length is 126 and is held in the next 12 bits; otherwise it is 124
                     and is held in the next 28 bits. Hence, in a file where the maximum
                     record length is less than 4095 bytes, this field contains x"30 7E
                     00 00". Otherwise, this field contains x"30 00 00 7C".

        4       2    Database sequence number, used by add-on products supplied
                     with this COBOL system.
      
        6       2    Integrity Flag. Indexed files only.      
                     If this is non-zero when the header is read, it indicates
                     that the file is corrupt.
      
        8      14    Creation date and time in YYMMDDHHMMSSCC format. Indexed
                     files only.
        
       22      14    Reserved.
      
       36      2     Reserved. Value 62 decimal; x"00 3E".
      
       38      1     Not used. Set to zeros.
      
       39      1     Organization.
                          1  =  Sequential
                          2  =  Indexed
                          3  =  Relative
      
       40      1     Not used. Set to zeros.
      
       41      1     Data compression routine number.
                       0           =    No compression
                       1           =    CBLDC001
                       2-127       =    Reserved for internal use
                       128-255     =    User-defined compression routine number
      
       42      1     Not used. Set to zeros.
      
       43      1     File format.
                       0       =       Default
                       1       =       C-ISAM
                       2       =       LEVEL II COBOL
                       3       =       Indexed file format used
                                          by this COBOL system
                       4       =       IDXFORMAT"4"
      
       44      4     Reserved.
      
       48      1     Recording mode.
                       0       =       Fixed format
                       1       =       Variable format
                     For indexed files, the Recording Mode field of the index
                     file takes precedence.
      
       49      5     Not used. Set to zeros.
      
       54      2     Not used. Set to zeros.
      
       56      2     Maximum record length.
                     Example: with a maximum record of length 80 characters, this
                     field will contain x"00 50".
      
       58      2     Not used. Set to zeros.
      
       60      2     Minimum record length.
                     Example: with a minimum record of length 2 characters, this
                     field will contain x"00 02".
      
       62      46    Not used. Set to zeros.
      
      108      4     Version and build data for the indexed file handler creating
                     the file. Indexed files only.
      
      112      16    Not used. Set to zeros.
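      
      As an illustration of the layout above, the following sketch picks
      a few fields out of a 128-byte File Header record. The function and
      dictionary names are hypothetical, multi-byte fields are assumed to
      be stored most significant byte first (as in the x"00 50" example),
      and the sample header is built by hand rather than taken from a
      real file.

```python
import struct

ORGANIZATIONS = {1: "sequential", 2: "indexed", 3: "relative"}


def parse_file_header(header: bytes) -> dict:
    """Extract selected fields from a variable structure File Header record."""
    if len(header) < 128:
        raise ValueError("File Header record is 128 bytes long")
    if header[0] >> 4 != 3:
        raise ValueError("first 4 bits must be 0011 (system record)")
    (max_len,) = struct.unpack_from(">H", header, 56)
    (min_len,) = struct.unpack_from(">H", header, 60)
    return {
        "organization": ORGANIZATIONS.get(header[39], "unknown"),
        "compression_routine": header[41],
        "file_format": header[43],
        "variable_format": header[48] == 1,
        "max_record_length": max_len,
        "min_record_length": min_len,
    }


# Hand-built sample: sequential, variable format, records of 2 to 80 bytes.
sample = bytearray(128)
sample[0:4] = bytes.fromhex("307E0000")
sample[36:38] = bytes.fromhex("003E")
sample[39] = 1
sample[48] = 1
sample[56:58] = bytes.fromhex("0050")
sample[60:62] = bytes.fromhex("0002")
info = parse_file_header(bytes(sample))
print(info)
```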
      

 Structure of Each File Organization
      
      The following sections describe the physical structure of the four
      data file organizations.
      

 Sequential Organization
      
      Two types of sequentially organized file are available under this
      COBOL system: record sequential and printer sequential. These file
      types are described in the following sections.
      

 Record Sequential Files
      
      Record sequential files are intended to cater for binary data. These
      files consist of a series of either fixed or variable length records.
      The order of records in these files is set by the order of WRITE statements
      when the file is created. The record order does not change once it
      has been set. New records are added to the end of the file. Each record
      in a record sequential file (except the first record) has a unique
      record which precedes it, while each record (except the last record)
      also has a unique record that follows it.
      
      Record sequential files that are fixed length and are not destined
      for the printer have no record delimiter; the end of one record is
      immediately followed by the beginning of the next.
      

 Fixed Format Record Sequential Structure
      
      In a fixed format record sequential file, each record immediately
      follows the previous record in the file. Each record is the same length
      as the maximum length record.
      
      +--------------------------------------------+
      |  Fixed length record                       |
      +--------------------------------------------+
      |  Fixed length record                       |
      +--------------------------------------------+
      .                                            .
      .                                            .
      +--------------------------------------------+
      |  Fixed length record                       |
      +--------------------------------------------+
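      
      Because there are no headers or delimiters, a fixed format record
      sequential file can be split purely by record length. A minimal
      sketch with a hypothetical function name, where the caller supplies
      the maximum record length because the file itself does not store it:

```python
from typing import List


def split_fixed_records(data: bytes, record_length: int) -> List[bytes]:
    """Split the contents of a fixed format record sequential file."""
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    if len(data) % record_length != 0:
        raise ValueError("file size is not a multiple of the record length")
    return [data[i:i + record_length]
            for i in range(0, len(data), record_length)]


print(split_fixed_records(b"AAAABBBBCCCC", 4))
```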
      

 Variable Format Record Sequential Structure
      
      In a variable format record sequential file, each record written is
      preceded by a record header containing the length of the record. The
      record is written at the length defined in the program. The file
      begins with a standard variable structure File Header record.
      
      Up to three padding characters can follow a record to ensure that
      the next record starts on a four-byte boundary.
      
      +----------------------------------------------+
      | File Header record - 128 bytes               |
      |                                              |
      +--------+----------------------------+---+----+
      | Header | Variable length record     |   |
      +--------+----------------------------+---+--------+--+
      | Header | Variable length record                  |  |
      +--------+-----------------------------------------+--+
      .        .                  .                   .
      .        .                  .                   .
      +--------+-------------------------------------+---+
      | Header | Variable length record              |   |
      +--------+-------------------------------------+---+
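      
      The layout in the diagram can be walked with a short loop. This is
      a sketch under stated assumptions, not the file handler's algorithm:
      the function names are hypothetical, control fields are taken to be
      most significant byte first, and the size of the record headers is
      inferred from whether the File Header record holds the length 126
      in its first 12 length bits. The sample image is built by hand.

```python
def iter_user_records(data: bytes):
    """Yield user data records (type 0100) from a variable format image."""
    first = int.from_bytes(data[0:2], "big")
    header_size = 2 if (first & 0x0FFF) == 126 else 4
    bits = header_size * 8 - 4
    pos = 128                                   # skip the File Header record
    while pos + header_size <= len(data):
        field = int.from_bytes(data[pos:pos + header_size], "big")
        rec_type, length = field >> bits, field & ((1 << bits) - 1)
        start = pos + header_size
        if rec_type == 4:                       # user data record
            yield data[start:start + length]
        pos = (start + length + 3) & ~3         # next 4-byte boundary


def build_sample() -> bytes:
    """Build a tiny image: File Header record plus two user records."""
    image = bytearray(128)
    image[0:4] = bytes.fromhex("307E0000")
    for rec in (b"HELLO", b"COBOL!!"):
        image += (0x4000 | len(rec)).to_bytes(2, "big") + rec
        image += b" " * (-len(image) % 4)       # padding characters
    return bytes(image)


print(list(iter_user_records(build_sample())))
```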
      

 Printer Sequential Files
      
      You can define a sequential file as a printer sequential or
      printer destined file by specifying one of the following clauses:
      
       - LINE ADVANCING in the SELECT statement
      
       - ASSIGN TO PRINTER
      
      For printer sequential files, specifying the WRITE statement without
      the BEFORE or AFTER clause has the same effect as if you had specified
      AFTER 1. Specifying the WRITE statement with the BEFORE or AFTER clauses
      gives explicit vertical positioning which you must only use for files
      destined for the printer. Using these clauses for any other type of
      file will generally corrupt the file.
      
      Printer sequential files should not be opened for INPUT or I/O.
      
      Printer sequential file format consists of a sequence of print records
      which are terminated by a carriage return (x"0D") with zero or more
      vertical positioning characters between the print records.
      
      A print record consists of zero or more printable characters.
      
      The OPEN statement causes a x"0D" to be written to the file to ensure
      that the printer is located at the first character position before
      printing the first data record.
      
      The WRITE statement causes trailing spaces to be removed from the
      record before it is written to the printer with a terminating x"0D".
      
      The BEFORE or AFTER clause specified in the WRITE statement causes
      one or more line-feed characters (x"0A"), a form-feed character (x"0C"),
      or a vertical tab character (x"0B") to be sent to the printer after
      or before the data record is written, respectively.
      

 Printer Sequential Structure
      
      +----+
      | 0D |
      +----+
      | 0A |
      +----+------------------------------------------+----+
      |Print record                                   | 0D |
      +----+----+----+--------------------------------+----+
      | 0A | 0A | 0A |
      +----+----+----+--------------------------+----+
      |Print record                             | 0D |
      +--------------------------------+----+---+----+
      |Print record                    | 0D |
      +----+---------------------------+----+
      | 0C |
      +----+
      | 0D |
      +----+----+
      | 0A | 0A |
      +----+----+-----------------------------+----+
      |Print record                           | 0D |
      +---------------------------------------+----+
         .     .     .   .     .     .
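      
      The byte stream in the diagram can be imitated as follows. This is
      a sketch of line advancing only. The helper names are hypothetical,
      and page and vertical tab positioning are left out because the text
      does not give their exact byte sequences.

```python
CR = b"\r"   # x"0D", terminates each print record
LF = b"\n"   # x"0A", one line of vertical positioning


def open_stream() -> bytearray:
    """OPEN writes a single x"0D" before the first data record."""
    return bytearray(CR)


def write_after(stream: bytearray, record: bytes, lines: int = 1) -> None:
    """WRITE ... AFTER n: positioning first, then the trimmed record."""
    stream.extend(LF * lines + record.rstrip(b" ") + CR)


def write_before(stream: bytearray, record: bytes, lines: int = 1) -> None:
    """WRITE ... BEFORE n: the trimmed record first, then positioning."""
    stream.extend(record.rstrip(b" ") + CR + LF * lines)


out = open_stream()
write_after(out, b"TOTAL     ", 1)
write_after(out, b"42", 3)
print(bytes(out))
```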
      

 Line Sequential Organization
      
      Line sequential files are implemented to be consistent with your system
      editor and any other similar utilities that use text files. They are
      strictly operating system dependent; however, the scheme used by the
      PC-DOS, OS/2 and UNIX operating systems is widely used and is described
      here.
      
      A record delimiter is written after every record. The delimiter character(s)
      vary depending on your operating environment. See the environment
      specific sections for line sequential files below for further information.
      
      Line sequential files hold variable length text records, each containing
      zero or more displayable or non-displayable characters. A WRITE statement
      removes trailing spaces from the data record, then adds the system
      record delimiter. A READ statement removes the record delimiter and,
      if necessary, pads the record area with trailing spaces or returns
      surplus text as following records. The record delimiter is chosen to
      be consistent with your system editor.
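      
      The WRITE and READ behavior described above can be sketched as
      follows. The function names are hypothetical, and padding the final
      piece of surplus text with spaces is an assumption.

```python
from typing import List


def line_for_record(record: bytes, delimiter: bytes) -> bytes:
    """WRITE: remove trailing spaces, then add the record delimiter."""
    return record.rstrip(b" ") + delimiter


def records_from_line(line: bytes, record_area_size: int) -> List[bytes]:
    """READ: fit one line (delimiter already removed) into the record area.

    A short line is padded with trailing spaces. A long line is returned
    as several records, the surplus text following the first.
    """
    if record_area_size <= 0:
        raise ValueError("record_area_size must be positive")
    size = record_area_size
    chunks = [line[i:i + size] for i in range(0, len(line), size)] or [b""]
    return [chunk.ljust(size, b" ") for chunk in chunks]


print(line_for_record(b"ABC   ", b"\n"))
print(records_from_line(b"ABCDEFG", 5))
```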
      
      A line sequential file must not be described as a printer destined
      file and must not use the BEFORE or AFTER clause in the WRITE statement.
      
      System editors expect text to contain only displayable characters.
      However, line sequential files allow non-displayable characters with
      a value of less than x"20" (space) to be written to and read from
      them.
      
      During a WRITE operation, non-displayable characters in the record
      area are written to the file, each with a preceding LOW-VALUES or
      null character (x"00") to show that they are not text characters.
      A READ operation on the file removes the preceding LOW-VALUES characters
      added during the WRITE operation. You can prevent null insertion when
      writing to the file either by specifying the -N run-time
      switch, or by a call to functions 46 or 47 of routine x"91" to turn
      the N switch on or off, respectively, for a particular file.
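      
      Null insertion and its removal can be sketched as a pair of
      functions. The names are hypothetical. The sketch follows the text
      literally by treating every byte below x"20" as non-displayable.

```python
def insert_nulls(record: bytes) -> bytes:
    """WRITE side: put x"00" in front of each byte below x"20"."""
    out = bytearray()
    for b in record:
        if b < 0x20:
            out.append(0x00)
        out.append(b)
    return bytes(out)


def remove_nulls(stored: bytes) -> bytes:
    """READ side: drop each inserted x"00" and keep the byte after it."""
    out = bytearray()
    i = 0
    while i < len(stored):
        if stored[i] == 0x00 and i + 1 < len(stored):
            i += 1                      # skip the inserted null
        out.append(stored[i])
        i += 1
    return bytes(out)


print(insert_nulls(b"A\x1aB"))
```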
      
      During a READ operation, any tab characters in a line sequential
      file (x"09") are expanded to every eighth character position; that
      is, the character following a tab will be in one of the columns 9,
      17, 25, 33, and so on. You can compress space characters to tabs during
      output using either the +T run-time switch, or a call
      to function 48 or 49 of routine x"91" to turn the T switch on or off,
      respectively, for a particular file.
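      
      The tab expansion rule can be illustrated with a few lines of code.
      The function name is hypothetical. For a single line, Python's
      built-in bytes.expandtabs(8) gives the same result.

```python
def expand_tabs(line: bytes, tab_width: int = 8) -> bytes:
    """Replace each x"09" with spaces up to the next tab stop."""
    out = bytearray()
    for b in line:
        if b == 0x09:
            # Pad so that the next character lands in column 9, 17, 25, ...
            out.extend(b" " * (tab_width - len(out) % tab_width))
        else:
            out.append(b)
    return bytes(out)


# "C" follows the tab, so it ends up in column 9 (index 8).
print(expand_tabs(b"AB\tC"))
```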
      

 Line Sequential Files on DOS, Windows and OS/2
      
      The record delimiter x"0D0A" is used on DOS, Windows and OS/2.
      
      Any single byte x"1A" (user terminate run code) is used as an unconditional
      file terminator (except when preceded by a null character, as described
      below). If no x"1A" character is encountered, the physical end of
      the file serves as the file terminator.
      
      When the file is closed, a terminating x"1A" character is NOT written.
      Instead, the length of the file is used to determine where it ends.
      
      On input, this COBOL system uses just the x"0A" as the record delimiter.
      Additional device control characters (such as x"0D", x"0B", x"0C")
      are discarded. x"1A" acts as a record delimiter and also denotes the
      end of the file.  If you turn the N run-time switch off, you must
      make sure that any COMP data does not contain bytes with a value of
      x"1A" (end-of-file character) or x"0D" (record delimiter).
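      
      The input rules above can be sketched as a simple scanner. This is
      an illustration with a hypothetical function name. It assumes the
      data contains no null-prefixed characters, and it does not return
      an empty record when x"1A" directly follows a delimiter.

```python
from typing import List


def read_dos_line_sequential(data: bytes) -> List[bytes]:
    """Split a DOS, Windows or OS/2 line sequential image into records."""
    records = []
    current = bytearray()
    for b in data:
        if b == 0x1A:                    # end of file, also ends the record
            break
        if b == 0x0A:                    # record delimiter on input
            records.append(bytes(current))
            current.clear()
        elif b in (0x0D, 0x0B, 0x0C):    # device control characters: discard
            continue
        else:
            current.append(b)
    if current:
        records.append(bytes(current))
    return records


print(read_dos_line_sequential(b"ONE\r\nTWO\r\n\x1aIGNORED"))
```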
      

 Line Sequential Files on UNIX
      
      The record delimiter on UNIX is a single byte x"0A" (the default).
      However, for line sequential and relative files only, this default
      record delimiter can be changed to that used by DOS, Windows and OS/2.
      
      If you turn off the N run-time switch (-N), you must make sure that
      any COMP data does not contain bytes with a value of x"0A" (record
      delimiter).
      

 Line Sequential Structure
      
      +-----------------------------------------------+---------+
      | Variable length record                        |delimiter|
      +----------------------------------+---------+--+---------+
      | Variable length record           |delimiter|
      +----------------------------------+---------+--+---------+
      | Variable length record                        |delimiter|
      +-----------------------------------------------+---------+
       .        .                           .
       .        .                           .
       .        .                           .
      +----------------------------------------------+---------+
      | Variable length record                       |delimiter|
      +----------------------------------------------+---------+
      

 Relative Organization
      
      Relative file organization enables you to access any record randomly
      by specifying its ordinal position within the file. A relative file
      can be of fixed or variable format, but in either case each record
      occupies a fixed length slot, the length being that of the longest
      record defined for the file. This is necessary so that the COBOL file
      handling routines can quickly calculate the physical location of any
      record given its record number within the file.
      
      Each record is uniquely identified by a record number. The first record
      in the file is record number one, the second record is number two,
      and so on.
      
      Each record is followed by a record marker which indicates the
      current state of the record. In a variable format file, the marker
      follows the fixed length slot. The marker varies depending on your
      environment. See the environment specific information sections for
      relative files below for further information.
      
      When you delete a record from a relative file, the only action is
      to change that record's marker. However, the contents of a deleted
      record physically remain in the file until a new record is written.
      If, for security reasons, you want to make sure that the data does
      not exist in the file, then you must overwrite the record using the
      REWRITE statement before you delete it.
      
      A fixed format relative file can be processed as a fixed format sequential
      organization file by defining the maximum record length to be larger
      than that for the relative file (see the sections on operating environment
      specific information for details). A variable format relative file
      cannot be processed as a sequential organization file.
      
      The length of a relative file is determined by the largest record
      number used when actually writing a record to the file.
      

 Relative File Organization on DOS, Windows and OS/2
      
      On DOS, Windows and OS/2, the current state of the record is indicated
      by a two-byte marker as follows:
      

      Marker (hex)   Description
      __________________________________
      
      0D0A           Record present
      
      0D00           Record deleted or never written.
      
      A fixed format relative file can be processed as a fixed format sequential
      file by defining the maximum record length to be two characters larger
      than that for the relative file.
      
      The size of a relative file on DOS, Windows and OS/2 is calculated
      as follows.
      
      Fixed format:
      
         (max-rec-len + 2) * largest-record-number
      
      Variable format:
      
         128 + (max-rec-len + 2 + header) * largest-record-number
      
      where header is 2 if max-rec-len is less than 4096, otherwise header is 4.
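      
      The two formulas translate directly into code. A small sketch with
      a hypothetical function name:

```python
def relative_file_size_dos(max_rec_len: int,
                           largest_record_number: int,
                           variable_format: bool = False) -> int:
    """Size in bytes of a relative file on DOS, Windows and OS/2."""
    if not variable_format:
        return (max_rec_len + 2) * largest_record_number
    header = 2 if max_rec_len < 4096 else 4
    return 128 + (max_rec_len + 2 + header) * largest_record_number


# 80-byte records, largest record number written is 10.
print(relative_file_size_dos(80, 10))                        # fixed format
print(relative_file_size_dos(80, 10, variable_format=True))  # variable format
```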
      

 Relative File Organization on UNIX
      
      On UNIX, the current state of a record for fixed length relative records
      is indicated by a one-byte marker as follows:
      
      Marker (hex)   Description
      __________________________________
      
      0A             Record present
      
      00             Record deleted or never written
      
      The current state of a record for variable length relative records
      is indicated by a two-byte marker as follows:

      Marker (hex)   Description
      __________________________________
      
      0D0A           Record present
      
      0D00           Record deleted or never written
      
      A fixed format relative file can be processed as a fixed format sequential
      file by defining the maximum record length to be one character larger
      than that for the relative file.
      
      The size of a relative file on UNIX systems is calculated as follows.
      
      Fixed format:
      
         (max-rec-len + 1) * largest-record-number
      
      Variable format:
      
         128 + (max-rec-len + 2 + header) * largest-record-number
      
      where header is 2 if max-rec-len is less than 4096, otherwise header is 4.
      

 Fixed Format Relative Structure
      
      A fixed format relative file is the same as a fixed format sequential
      file, except each record is followed by a record marker.
      
      +-------------------------------------------+------+
      |   Fixed length record - Record 1          |marker|
      +-------------------------------------------+------+
      |   Fixed length record - Record 2          |marker|
      +-------------------------------------------+------+
      .                                         .    .
      .                                         .    .
      +-------------------------------------------+------+
      |   Fixed length record - Record i  deleted |marker|
      +-------------------------------------------+------+
      .                                         .    .
      .                                         .    .
      +-------------------------------------------+------+
      |   Fixed length record - Record j - unused |marker|
      +-------------------------------------------+------+
      .                                         .    .
      .                                         .    .
      +-------------------------------------------+------+
      |   Fixed length record - Record n          |marker|
      +-------------------------------------------+------+
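      
      Because every slot has the same size, a record can be located by
      arithmetic alone. The sketch below is read-only and uses a
      hypothetical function name. The caller passes the "record present"
      marker for the environment: x"0D0A" on DOS, Windows and OS/2, or
      x"0A" on UNIX.

```python
from typing import Optional


def read_relative_record(data: bytes, record_number: int,
                         max_rec_len: int,
                         present_marker: bytes) -> Optional[bytes]:
    """Return record record_number (counting from 1), or None if absent."""
    slot = max_rec_len + len(present_marker)   # record plus its marker
    start = (record_number - 1) * slot
    if record_number < 1 or start + slot > len(data):
        return None                            # beyond the end of the file
    marker = data[start + max_rec_len:start + slot]
    if marker != present_marker:
        return None                            # deleted or never written
    return data[start:start + max_rec_len]


# Three 4-byte slots in the DOS layout; record 2 is marked as deleted.
image = b"AAAA\r\n" + b"BBBB\r\x00" + b"CCCC\r\n"
print(read_relative_record(image, 3, 4, b"\r\n"))
```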
      
    UNIX
      For relative files in random access, writing records 1, 2 and 9 will
      occupy the same disk space as creating a file containing records 1,
      2 and 3 on UNIX.
