On Nov 16, 2010, at 5:02 PM, Jeremy Begg wrote:
Thanks for the feedback, Jeremy.
>> I'm assuming that if the record format is FAB$C_VAR or FAB$C_VFC, the
>> records will never contain binary data with embedded newlines. Is
>> that true? What other assumptions am I making that I shouldn't?
> The great advantage of the VAR and VFC formats (compared to the various
> Stream formats) is that you *can* have embedded newlines and other
> non-printable characters in the record. So it's quite conceivable that
> a program might write "binary" data to a VAR or VFC file. On the other
> hand it would not be common Perl practice to do so.
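
Right, and at the RMS level that's easy to see: a $PUT hands RMS an opaque
buffer, so nothing stops a record from containing newlines, tabs, or even
NULs. Here's a rough, untested sketch of what such a program might look
like (the file name and data are made up):

#include <rms.h>
#include <starlet.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    struct FAB fab = cc$rms_fab;            /* file access block   */
    struct RAB rab = cc$rms_rab;            /* record access block */
    static char rec[] = "binary\nrecord\twith\0junk";  /* embedded \n, \t, NUL */

    fab.fab$l_fna = "rms_test.dat";
    fab.fab$b_fns = sizeof "rms_test.dat" - 1;
    fab.fab$b_rfm = FAB$C_VAR;              /* variable-length records */
    fab.fab$b_org = FAB$C_SEQ;              /* sequential file         */

    if (!(sys$create(&fab) & 1))
        fprintf(stderr, "sys$create failed\n"), exit(1);

    rab.rab$l_fab = &fab;
    if (!(sys$connect(&rab) & 1))
        fprintf(stderr, "sys$connect failed\n"), exit(1);

    /* One $PUT writes exactly one record; RMS never inspects the bytes,
       so the newline stays inside the record instead of ending it.     */
    rab.rab$l_rbf = rec;
    rab.rab$w_rsz = sizeof rec - 1;
    if (!(sys$put(&rab) & 1))
        fprintf(stderr, "sys$put failed\n"), exit(1);

    sys$close(&fab);
    return 0;
}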
Is it possible using the CRTL? I'm not sure it's reasonable to expect
Perl to do better than C can do. Here's what seems to me like the
relevant section from the CRTL manual:
=====
1.8.2.2.1 Accessing Variable-Length or VFC Record Files in Record Mode
When you access a variable-length or VFC record file in record mode, many
I/O functions behave differently than they would if they were being used
with stream mode. This section describes these differences.

In general, the new-line character (\n) is the record separator for all
record modes. On output, when a new-line character is encountered, a record
is generated unless you specify an optional argument (such as "ctx=bin" or
"ctx=xplct") that affects the interpretation of new lines.
=====
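
Taking the manual at its word, something like the following untested sketch
(the file name is made up) ought to keep an embedded newline inside a single
VAR record, precisely because "ctx=bin" turns off the new-line interpretation:

#include <unixio.h>
#include <file.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* "rfm=var" asks for variable-length records; "ctx=bin" tells the
       CRTL not to treat \n as a record separator on output.           */
    int fd = open("ctx_bin_test.dat", O_RDWR | O_CREAT, 0, "rfm=var", "ctx=bin");
    if (fd == -1)
        perror("open failed"), exit(1);

    /* With ctx=bin the embedded \n should stay inside one record
       rather than splitting the data in two.                      */
    if (write(fd, "one record with \n in the middle", 31) < 0)
        perror("write failed"), exit(1);

    close(fd);
    return 0;
}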
I'm pretty sure what I'm proposing will do in Perl exactly what the
CRTL says it will do, but, as always, I'm grateful when people catch
my mistakes before I'm finished making them.
FAB$L_CTX is not available in the stat buffer and thus would be
difficult to get at, and in fact we know in this case what the open
call looked like and that it did not include "ctx=bin". I think the
following demonstrates that you cannot embed newlines in a record
using unix I/O from C, at least without ctx=bin:
$ type record_test.c
#include <unixio.h>
#include <unistd.h>
#include <file.h>
#include <stdlib.h>
main()
{
    int file;
    int flags;

    flags = O_RDWR | O_CREAT;
    file = open("record_test.dat", flags, 0, "rfm=var");
    if (file == -1)
        perror("open failed"), exit(1);
    if (write(file, "This record contains tab (\t) and newline(\n) in the middle\n", 58) < 0)
        perror("write failed"), exit(1);
    close(file);
}
$ cc record_test
$ link record_test
$ r record_test
$ type record_test.dat
This record contains tab ( ) and newline(
) in the middle
$ dump/record record_test.dat
Dump of file D0:[CRAIG.TEST]RECORD_TEST.DAT;1 on 16-NOV-2010 22:08:19.68
File ID (116860,63,0) End of file block 1 / Allocated 33
Record number 1 (00000001), 42 (002A) bytes, RFA(0001,0000,0000)
 646E6120 29092820 62617420 736E6961 746E6F63 2064726F 63657220 73696854 This record contains tab (.) and 000000
                                                   0A28 656E696C 77656E20 newline(....................... 000020
Record number 2 (00000002), 16 (0010) bytes, RFA(0001,0000,002C)
                                      0A656C64 64696D20 65687420 6E692029 ) in the middle................. 000000
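
Incidentally, while FAB$L_CTX isn't in the stat buffer, the record format
itself is: the CRTL's struct stat carries VMS-specific fields such as
st_fab_rfm, so detecting the VAR and VFC cases should at least be possible.
A rough, untested sketch:

#include <stat.h>
#include <rms.h>
#include <stdio.h>

int main(void)
{
    struct stat st;

    if (stat("record_test.dat", &st) != 0) {
        perror("stat failed");
        return 1;
    }

    /* st_fab_rfm holds the RMS record format byte. */
    if (st.st_fab_rfm == FAB$C_VAR || st.st_fab_rfm == FAB$C_VFC)
        printf("variable-length or VFC records\n");
    else
        printf("some other record format (%d)\n", st.st_fab_rfm);

    return 0;
}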
> When I'm writing programs I assume that each 'print' statement will
> result in one output record. I suppose you could arrange for each
> 'print' to be followed by a flush to disk at that point, but of course
> performance would probably suffer.
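
In C terms that "flush after every print" arrangement would look something
like the untested sketch below (whether each flush actually closes out a
record depends on the ctx= settings quoted earlier), and you'd be paying
for one RMS operation per line instead of one per buffer:

#include <stdio.h>

int main(void)
{
    int i;
    /* The VMS CRTL fopen() accepts RMS keyword arguments after the mode. */
    FILE *fp = fopen("flush_test.dat", "w", "rfm=var");
    if (fp == NULL) {
        perror("fopen failed");
        return 1;
    }

    for (i = 0; i < 1000; i++) {
        fprintf(fp, "line %d\n", i);
        fflush(fp);     /* force the line out now: correct, but slow */
    }

    fclose(fp);
    return 0;
}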
What about printing a list? Should each list item produce a new
record, or should all elements be coalesced into a single (potentially
humongous) write because they all came from one print statement? I'm
not sure a print statement that can function as a list operator maps
very well onto your assumption; Perl ain't Fortran.
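
For concreteness, the two possible mappings would look roughly like this in
C (illustrative only; the item strings, file name, and buffer size are made
up):

#include <unixio.h>
#include <file.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>

/* One write per list element: in record mode this would tend to give
   one record per element.                                             */
static void per_element(int fd, const char *items[], int n)
{
    int i;
    for (i = 0; i < n; i++)
        write(fd, items[i], strlen(items[i]));
}

/* Join everything first, then issue a single (potentially humongous)
   write for the whole list.                                           */
static void coalesced(int fd, const char *items[], int n)
{
    char buf[4096];
    int i;
    buf[0] = '\0';
    for (i = 0; i < n; i++)
        strncat(buf, items[i], sizeof buf - strlen(buf) - 1);
    write(fd, buf, strlen(buf));
}

int main(void)
{
    const char *items[] = { "item one\n", "item two\n", "item three\n" };
    int fd = open("list_test.dat", O_RDWR | O_CREAT, 0, "rfm=var");
    if (fd == -1)
        perror("open failed"), exit(1);
    per_element(fd, items, 3);      /* or coalesced(fd, items, 3) */
    close(fd);
    return 0;
}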
Cheers,
Craig
________________________________________
Craig A. Berry
mailto:craigbe...@mac.com
"... getting out of a sonnet is much more
difficult than getting in."
Brad Leithauser