Re: buggy KEYIN; MasterBASIC's tokenized format; buggy RENUM

2012-11-16 Thread Simon Owen
On 15 Nov 2012, at 22:57, Marcos Cruz wrote:
 What do you mean auto-typed? Text spooling?

Yep, that's it!  It's a Windows-only feature at the moment, but I'll 
extend it to support other platforms.


 AFAIK SimCoupe lacks a file spooling option (in fact it is what I need:
 SimCoupe to type the content of a text file of the host machine).

I'll add file spooling for non-Windows platforms, and clipboard paste once SDL 
2.0 is supported.  The Windows version will likely remain clipboard only, since 
sniffing the encoding from text files is just too unreliable.


 there are many labels in the imported KEYIN-ed code. I'll remove them too
 --just to see what happens.

I'll be interested to hear how you get on with this.  If it does fix it, labels 
are definitely to be avoided until the ROM issue is tracked down and 
(hopefully) fixed.

Si



text spooling

2012-11-16 Thread Marcos Cruz
On 2012-11-16 14:41, Simon Owen wrote:

 I'll add file spooling for non-Windows platforms, and clipboard paste
 once SDL 2.0 is supported.  

Great! It will be possible to code in MasterBASIC with a modern editor. Cannot 
wait to try it :)

 The Windows version will likely remain
 clipboard only, since sniffing the encoding from text files is just
 too unreliable.

I don't use Windows, but your comment about file encoding makes me think
the text is converted before auto-typing it. Isn't it? I mean
non-ASCII characters.

When I wrote my MBim toolkit I considered how to write SAM-specific
characters in the source (e.g. block graphics and UDG) and non-ASCII
characters whose code is different from the current 8-bit standards like
ISO-8859-1. I solved the first problem with the simple notation used by
BASin (the old ZX Spectrum IDE for Windows). For the second, I simply
used ISO-8859-1 in the source. Then my Vim converter translated
everything, the BASin notation and the ISO-8859-1 non-ASCII characters,
to the actual SAM characters. The text was ready to be KEYIN-ed.  So far
so good.

But I think automatic translation during spooling has some drawbacks:
first, it would be useful only for non-ASCII characters (mainly, foreign
language letters) provided by the SAM (mainly, by MasterBASIC) --or for
all characters, in case the file is encoded in any ASCII-incompatible
format, e.g. UTF-16, which is not common; second, it could ruin an ad hoc
character translation done by the programmer in the source.

Example: The charset provided by MasterBASIC lacks four Spanish letters
(uppercase Á, Í, Ó and Ú). If I use them in the source (I mean, in the
texts managed by the program; the comments are irrelevant), I'd have to
choose what character codes must represent them, how to translate them
before spooling and finally how to design those missing chars as UDG.
Automatic translation doesn't help because those characters are not part
of the SAM charset, and my own character codes could be misunderstood
by the file spooler as part of a UTF-8 multibyte character.

Therefore, in my opinion, a simple and versatile option could be: first,
assuming the spool file is encoded in an 8-bit ASCII-compatible charset
(the actual encoding is irrelevant); and second, feeding it as is to
the SAM, without translation (except, of course, for end of line and
maybe other control characters).
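
A minimal Python sketch of this "feed it as is" option follows. It is only an illustration, not SimCoupe code: the function name spool_bytes_raw is hypothetical, and it assumes the SAM expects code 13 (CR) for ENTER. Only line endings are touched; every other byte, including values 128-255, is passed through unchanged.

```python
def spool_bytes_raw(data: bytes) -> bytes:
    """Return the bytes to auto-type, without any character translation.

    Hypothetical helper. Only line endings are normalised: CR LF and bare
    LF both become CR (assumed here to be the SAM's ENTER code, 13).
    All other bytes are left exactly as they are in the file.
    """
    # Handle CR LF first so a Windows line ending yields a single CR.
    return data.replace(b"\r\n", b"\r").replace(b"\n", b"\r")
```

With this approach a private 8-bit code such as 0xC1 in the source reaches the SAM as 0xC1, whatever encoding the editor believed the file was in.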

Marcos

-- 
http://programandala.net


Re: text spooling

2012-11-16 Thread Simon Owen
On 16/11/2012 16:15, Marcos Cruz wrote:
 the text is converted before auto-typing it. Isn't it? I mean
 non-ASCII characters.
   
Yes, in the simplest case it's just to map £ and © to the special codes
needed for SAM use, and to drop CR but convert NL to CR.  Though I
also use the Win32 API to do a more thorough transliteration to the
closest ASCII equivalent (mostly stripping diacritics).  I also added
manual transliteration of Cyrillic characters to attempt to preserve
comments in a batch of Russian BASIC listings.
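
The steps described above could be sketched roughly as follows in Python. This is not the Win32 code SimCoupe uses: the names SAM_SPECIAL, strip_diacritics and to_sam_bytes are hypothetical, the SAM codes for the pound and copyright signs (0x60 and 0x7F) are an assumption to be checked against the SAM character set, and Unicode NFKD decomposition stands in for the Win32 transliteration. The Cyrillic mapping is not covered.

```python
import unicodedata

# Assumed SAM character codes for the two special cases; verify them
# against the SAM character set before relying on them.
SAM_SPECIAL = {"£": 0x60, "©": 0x7F}


def strip_diacritics(text: str) -> str:
    """Decompose accented letters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def to_sam_bytes(text: str) -> bytes:
    """Minimal translation plus transliteration to 7-bit output."""
    out = bytearray()
    for ch in strip_diacritics(text):
        if ch in SAM_SPECIAL:
            out.append(SAM_SPECIAL[ch])
        elif ch == "\r":
            continue              # drop CR
        elif ch == "\n":
            out.append(0x0D)      # NL becomes CR
        elif ord(ch) < 0x80:
            out.append(ord(ch))   # plain ASCII passes through
        else:
            out.append(ord("?"))  # no ASCII equivalent found
    return bytes(out)
```

For example, to_sam_bytes("Coupé £5\r\n") yields the bytes for "Coupe ", then 0x60, then "5" and a single CR.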

For the other ports I was planning to use iconv to do the main
transliteration step.  Under Linux iconv (part of libc-bin) appears to
include the support I'm after.  Mac OS X is still using the traditional
libiconv, which gives strange results with the accents separated out
(coupé -> coup'e).  If I can't find a quick and easy solution for
that I'll just drop transliteration support.  I'd rather spend my
SimCoupe development time on emulation, not text conversion!


 But I think automatic translation during spooling has some drawbacks:
 first, it would be useful only for non-ASCII characters (mainly, foreign
 language letters) provided by the SAM (mainly, by MasterBASIC) --or for
 all characters, in case the file is encoded in any ASCII-incompatible
 format, e.g. UTF-16, which is not common;

That uncertainty is the reason for only using the clipboard in Windows
-- I get the content in Unicode, so I don't need to guess the character
encoding.  It appears SDL 2.0 (still under development) will be able to
provide the clipboard contents in UTF-8, which should give similar
results when combined with a working iconv.


 second, it could ruin an ad hoc
 character translation done by the programmer in the source.
   
There will always be special cases, particularly for that kind of
private encoding scheme.  I think it'd be best to require the user to
convert the text before spooling, with automatic translation disabled in
SimCoupe.


 Therefore, in my opinion, a simple and versatile option could be: first,
 assuming the spool file is encoded in an 8-bit ASCII-compatible charset
 (the actual encoding is irrelevant);

Assuming spooled text files are UTF-8 on non-Windows platforms might
give a similar success rate to assuming ISO-8859-1 under Windows.  I did
implement file spooling in the Windows version, but I felt the character
coding uncertainty would generate poor results and too many support
e-mails, so I took it out.  It's tempting to do the same for non-Windows
platforms, but I might give it a chance.
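
To illustrate the uncertainty, here is a small Python sketch of one possible guessing strategy: try UTF-8 first and fall back to ISO-8859-1. It is an assumption for illustration only, not what SimCoupe does, and the name decode_spool_file is hypothetical.

```python
from typing import Tuple


def decode_spool_file(data: bytes) -> Tuple[str, str]:
    """Guess a spool file's encoding: UTF-8 first, then ISO-8859-1.

    Returns the decoded text and the name of the encoding that was used.
    """
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a character, so this never fails,
        # but that does not mean the guess is right.
        return data.decode("iso-8859-1"), "iso-8859-1"
```

The guess can still be wrong: the bytes 0xC3 0xA9 decode cleanly as a single "é" under UTF-8, even if the author meant them as two separate 8-bit character codes.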


 and second, feeding it as is to the SAM, without translation (except,
 of course, for end of line and maybe other control characters).
   
I'll probably add options for no translation, minimal translation, and
full transliteration.  Hopefully even without that help the spooling
process will still make your life a bit easier :)

Si



Re: text spooling

2012-11-16 Thread Thomas Harte
On 16 November 2012 10:38, Simon Owen <simon.o...@simcoupe.org> wrote:
 For the other ports I was planning to use iconv to do the main
 transliteration step.  Under Linux iconv (part of libc-bin) appears to
 include the support I'm after.  Mac OS X is still using the traditional
 libiconv, which gives strange results with the accents separated out
 (coupé -> coup'e).  If I can't find a quick and easy solution for
 that I'll just drop transliteration support.  I'd rather spend my
 SimCoupe development time on emulation, not text conversion!

What's your policy on native code? I ran a quick test with:

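#import <Foundation/Foundation.h>

NSString *testString = @"Coupé";
char terminator = '\0';
// Lossy conversion to ASCII replaces é with a plain e.
NSMutableData *asciiData = [[[testString
    dataUsingEncoding:NSASCIIStringEncoding allowLossyConversion:YES]
    mutableCopy] autorelease];
// appendBytes:length: takes a pointer, so pass the terminator's address.
[asciiData appendBytes:&terminator length:1];
NSLog(@"%@ %s", testString, (const char *)[asciiData bytes]);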

Output was: Coupé Coupe

cStringUsingEncoding: directly on the string didn't do the job sadly
since it doesn't permit a lossy conversion and there's no direct
method for a lossy conversion with a C-style terminator.


Re: text spooling

2012-11-16 Thread Marcos Cruz
On 2012-11-16 18:38, Simon Owen wrote:

 I'll probably add options for no translation, minimal translation, and
 full transliteration.  

That would be great.

Marcos

-- 
http://programandala.net


Re: buggy KEYIN; MasterBASIC's tokenized format; buggy RENUM

2012-11-16 Thread Tim Paveley

 My guess is that the final label might be at some kind of
 page boundary, which trips up the code building the table.  I haven't
 tried to look into it -- any volunteers...?

So this may or may not be related, but I have memories of long BASIC 
programs getting corrupted, and the corruption would happen around a 
page boundary.  I'd end up putting long REM statements around the area 
affected.


I've found an example in the fortress code on the sad snail 
collection.  There are a bunch of REMs around line 40250.  I'm pretty 
certain that, if I could remember how to check, this would turn out to 
be at a page boundary.


HTH,
Tim


Re: buggy KEYIN; MasterBASIC's tokenized format; buggy RENUM

2012-11-16 Thread david

Quoting Tim Paveley <u...@samcoupescrapbook.co.uk>:


 My guess is that the final label might be at some kind of
 page boundary, which trips up the code building the table.  I haven't
 tried to look into it -- any volunteers...?

 So this may or may not be related but I've memories of long basic
 programs getting corrupted, and the corruption would happen around a
 page boundary.  I'd end up putting long REM statements around the
 area affected.


I'm sure I had this issue with the Newsdisk Basic text viewer a few  
times... and very sure I had the same problem when I used similar code  
for the Blitz diskzine. (SAM Prime used a special text editor from  
Nigel Kettlewell, so it didn't have that issue.)