On Nov 16, 2010, at 5:02 PM, Jeremy Begg wrote:

Thanks for the feedback, Jeremy.

I'm assuming that if the record format is FAB$C_VAR or FAB$C_VFC, the
records will never contain binary data with embedded newlines.  Is
that true?   What other assumptions am I making that I shouldn't?

The great advantage of the VAR and VFC formats (compared to the various
Stream formats) is that you *can* have embedded newlines and other
non-printable characters in the record.  So it's quite conceivable that
a program might write "binary" data to a VAR or VFC file.  On the other
hand it would not be common Perl practice to do so.

Is it possible using the CRTL? I'm not sure it's reasonable to expect Perl to do better than C can do. Here's what seems to me like the relevant section from the CRTL manual:

=====
1.8.2.2.1 Accessing Variable-Length or VFC Record Files in Record Mode

When you access a variable-length or VFC record file in record mode, many I/O
functions behave differently than they would if they were being used with
stream mode. This section describes these differences.

In general, the new-line character (\n) is the record separator for all record
modes. On output, when a new-line character is encountered, a record is
generated unless you specify an optional argument (such as "ctx=bin" or
"ctx=xplct") that affects the interpretation of new lines.
=====

I'm pretty sure what I'm proposing will do in Perl exactly what the CRTL says it will do, but, as always, I'm grateful when people catch my mistakes before I'm finished making them.

FAB$L_CTX is not available in the stat buffer and thus would be difficult to get at, and in fact we know in this case what the open call looked like and that it did not include "ctx=bin". I think the following demonstrates that you cannot embed newlines in a record using unix I/O from C, at least without ctx=bin:

$ type record_test.c
#include <unixio.h>
#include <unistd.h>
#include <file.h>
#include <stdlib.h>

main()
{
    int file;
    int flags;

    flags = O_RDWR | O_CREAT;

    file = open("record_test.dat", flags, 0, "rfm=var");
    if (file == -1)
        perror("open failed"), exit(1);

    if (write(file, "This record contains tab (\t) and newline(\n) in the middle\n", 58) < 0)
        perror("write failed"), exit(1);

    close(file);
}
$ cc record_test
$ link record_test
$ r record_test
$ type record_test.dat
This record contains tab (      ) and newline(
                                              ) in the middle

$ dump/record record_test.dat

Dump of file D0:[CRAIG.TEST]RECORD_TEST.DAT;1 on 16-NOV-2010 22:08:19.68
File ID (116860,63,0)   End of file block 1 / Allocated 33

Record number 1 (00000001), 42 (002A) bytes, RFA(0001,0000,0000)

646E6120 29092820 62617420 736E6961 746E6F63 2064726F 63657220 73696854 This record contains tab (.) and 000000
0A28 656E696C 77656E20 newline(....................... 000020

Record number 2 (00000002), 16 (0010) bytes, RFA(0001,0000,002C)

0A656C64 64696D20 65687420 6E692029 ) in the middle................. 000000
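
As a rough, portable illustration of the splitting seen in the dump (this is not VMS code and makes no claim about how the CRTL is implemented internally), the default behavior can be mimicked by ending a record at every newline. The name split_records is a hypothetical helper, and treating a trailing fragment with no newline as its own record is an assumption made only for this sketch.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not a CRTL function: walk a buffer and end a
 * record at each '\n', as the manual excerpt describes for default
 * record-mode output. Up to max_records lengths are stored in lengths[];
 * the return value is the total number of records found.
 * The newline is counted as part of the record it terminates, which is
 * what the byte counts in the DUMP/RECORD output suggest.
 */
size_t split_records(const char *buf, size_t len,
                     size_t *lengths, size_t max_records)
{
    size_t count = 0;   /* records found so far */
    size_t start = 0;   /* offset where the current record begins */
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] == '\n') {
            if (count < max_records)
                lengths[count] = i - start + 1;
            count++;
            start = i + 1;
        }
    }

    /* Assumption: leftover bytes with no newline form one last record. */
    if (start < len) {
        if (count < max_records)
            lengths[count] = len - start;
        count++;
    }

    return count;
}
```

Called on the same 58-byte buffer that record_test.c passes to write(), this returns 2 with lengths 42 and 16, the same sizes DUMP/RECORD reports for records 1 and 2.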


When I'm writing programs I assume that each 'print' statement will
result in one output record. I suppose you could arrange for each
'print' to be followed by a flush to disk at that point, but of course
performance would probably suffer.


What about printing a list? Should each list item produce a new record, or should all elements be coalesced into a single (potentially humongous) write because they all came from one print statement? I'm not sure a print statement that can function as a list operator maps very well onto your assumption; Perl ain't Fortran.

Cheers,
  Craig
________________________________________
Craig A. Berry
mailto:craigbe...@mac.com

"... getting out of a sonnet is much more
 difficult than getting in."
                 Brad Leithauser
