Re: [Simh] New simulator - VAX-11/782

2017-05-25 Thread Sergey Oboguev
Matt Burke  wrote:

> I probably should have researched memory barriers a bit more. 
> I knew a little bit about them but wasn't sure if they were needed here. 
> The problem may also exist for the rest of the shared memory.

For the purposes of running VMS, what really and ultimately matters is not what 
actual 78x hardware does, but what VMS expects it to do. And this latter bit 
can be learned from VMS sources, even if we do not have (and may be unable to 
obtain) sufficient knowledge about the hardware. 

A precise understanding of VMS expectations would require a more thorough look 
into the VMS sources than I have been able to take so far; however, my hunch is 
that what VMS wants is a subset of a stronger general statement: 

[SS:] "When VCPU1 and VCPU2 [in a shared-memory or cache-coherent 
multiprocessor] communicate via an IPI from VCPU1 to VCPU2, VCPU1 [most often] 
wants to pass its view of memory to VCPU2".

VMS may quite possibly expect less than that, however we do not (yet) know 
exactly what... it might take a night or more with the listings to understand 
the code well enough to figure this out exactly. (Comments in the code next to 
BBSSI/BBCCI instructions suggest that VMS uses them purposefully to flush and 
control the cache "manually"... hence my questions in the previous message.)

Also this "less" might turn out to be more difficult to implement in practice 
than SS, which is stronger but comparatively simpler.

My *hunch* however is that if SS were in place, VMS would most likely be happy 
with it. There is a slim chance this might prove wrong, and only code reading 
can tell, but I'd place money on it. Thus assuming it for now:

SS is easy to implement. There is a flag that CPU1 sets and CPU2 reads, to pass 
an IPI. If this flag is protected by a lock both on read and write sides, this 
would ensure SS, since locking primitives issue memory barriers (and compiler 
barriers as well -- another important thing to have in mind).

Right now ipc_send_int uses ipc_lock/ipc_unlock but ipc_poll_int does not. If 
ipc_poll_int used locking (matching one in ipc_send_int), that would provide 
memory barriers (along with compiler barriers) and provide SS.
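
A minimal sketch of the locking approach described above, written against
POSIX threads rather than the real sim_ipc module. The names demo_ipi_t,
demo_send_int and demo_poll_int are hypothetical and do not reflect the
actual ipc_send_int/ipc_poll_int signatures. Since the simulator runs the two
CPUs as separate processes, a real version would need a lock that both
processes can reach. The point is only that both sides take the same lock, so
the unlock in the sender and the lock in the receiver act as the barrier pair.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU "interrupt pending" state. */
typedef struct {
    pthread_mutex_t lock;     /* protects 'pending' on both sides */
    uint32_t        pending;  /* nonzero when an IPI is waiting   */
} demo_ipi_t;

void demo_ipi_init(demo_ipi_t *ipi)
{
    pthread_mutex_init(&ipi->lock, NULL);
    ipi->pending = 0;
}

/* Sender: stores made before this call become visible to a receiver
   that later observes the flag under the same lock. */
void demo_send_int(demo_ipi_t *ipi)
{
    pthread_mutex_lock(&ipi->lock);
    ipi->pending = 1;
    pthread_mutex_unlock(&ipi->lock);
}

/* Receiver: returns 1 and clears the flag if an IPI was pending, else 0. */
int demo_poll_int(demo_ipi_t *ipi)
{
    int was_pending;

    pthread_mutex_lock(&ipi->lock);
    was_pending = (ipi->pending != 0);
    ipi->pending = 0;
    pthread_mutex_unlock(&ipi->lock);
    return was_pending;
}
```

A caller would run demo_ipi_init() once, call demo_send_int() from the sending
CPU, and call demo_poll_int() from the receiving CPU's instruction loop.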

Alternatively, it is possible to insert a WMB (or full MB) primitive into 
ipc_send_int before setting an IPI flag, and a RMB (or full MB) primitive into 
ipc_poll_int after reading the flag, but these are platform-specific. (The RMB 
and WMB designations here also imply a compiler barrier on both sides of the 
hardware barrier.) VAX MP out of necessity has implementations of smp_rmb() and 
smp_wmb() that also include a compiler barrier (barrier(), COMPILER_BARRIER), 
but you do not really want to get into this host platform specific (and for 
x86, also host CPU model specific and bitness specific) mess unless really 
necessary... and for the 782 it really is not. Using 
locks is a much neater solution for 782 purposes, I think.
___
Simh mailing list
Simh@trailing-edge.com
http://mailman.trailing-edge.com/mailman/listinfo/simh

Re: [Simh] New simulator - VAX-11/782

2017-05-25 Thread Paul Koning

> On May 25, 2017, at 6:51 PM, Matt Burke  wrote:
> 
> On 25/05/2017 04:03, Sergey Oboguev wrote:
>> Superficially looking at (AS)MP VMS code, it appears that the following 
>> should (hopefully) suffice for correct operation:
>> 
>> 1. BBSSI and BBCCI should acquire a lock when accessing the memory location. 
>> A simplistic implementation may use one lock for the whole memory (or the 
>> whole MA780 memory bank). A more sophisticated implementation may use a 
>> bucket of locks, with a particular physical address within an MA bank 
>> mapping to a corresponding lock in the bucket (with a lock being shared by a 
>> range of MA physical addresses) -- but that would probably be an overkill 
>> for 2-CPU config which is not particularly heavy on synchronization.
> 
> My plan was to use just one global lock which would be set on the read cycle 
> and cleared on the write cycle.

If you have access to suitable atomic operations, you shouldn't even need a 
lock.  That would be nice.  
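
As an illustration of the lock-free alternative, the interlocked bit
instructions map fairly directly onto atomic read-modify-write operations.
The sketch below uses C11 <stdatomic.h>. The function names and the 32-bit
word are assumptions made for the example, not code from the simulator. It
also assumes the simulated memory can be declared _Atomic and that the host's
atomics are lock-free, which would be needed for them to work in memory
shared between two processes.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper modelled on BBSSI: atomically set bit 'pos' in *word
   and return the previous state of that bit (0 or 1). The branch decision
   is left to the caller. */
int demo_test_and_set_bit(_Atomic uint32_t *word, unsigned pos)
{
    uint32_t mask = (uint32_t)1 << (pos & 31);
    uint32_t old  = atomic_fetch_or(word, mask);   /* sequentially consistent */

    return (old & mask) != 0;
}

/* Hypothetical helper modelled on BBCCI: atomically clear bit 'pos' in *word
   and return the previous state of that bit (0 or 1). */
int demo_test_and_clear_bit(_Atomic uint32_t *word, unsigned pos)
{
    uint32_t mask = (uint32_t)1 << (pos & 31);
    uint32_t old  = atomic_fetch_and(word, (uint32_t)~mask);

    return (old & mask) != 0;
}
```

Because each helper is a single atomic operation with the default
(sequentially consistent) ordering, no separate lock or barrier is needed
around it.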

paul



Re: [Simh] New simulator - VAX-11/782

2017-05-25 Thread Matt Burke
On 25/05/2017 04:03, Sergey Oboguev wrote:
> Superficially looking at (AS)MP VMS code, it appears that the
> following should (hopefully) suffice for correct operation:
>
> 1. BBSSI and BBCCI should acquire a lock when accessing the memory
> location. A simplistic implementation may use one lock for the whole
> memory (or the whole MA780 memory bank). A more sophisticated
> implementation may use a bucket of locks, with a particular physical
> address within an MA bank mapping to a corresponding lock in the
> bucket (with a lock being shared by a range of MA physical addresses)
> -- but that would probably be an overkill for 2-CPU config which is
> not particularly heavy on synchronization.

My plan was to use just one global lock which would be set on the read
cycle and cleared on the write cycle.
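
A rough sketch of how the read-interlock / write-unlock pairing might look,
covering both the single global lock and the bucketed variant from the quoted
text. Everything here is hypothetical: the names, the bucket count, the
granule size and the flat memory array are chosen for illustration, and it
uses pthread mutexes inside one process rather than sim_ipc. With
DEMO_NLOCKS set to 1 it behaves as one global lock.

```c
#include <pthread.h>
#include <stdint.h>

#define DEMO_NLOCKS   1u      /* 1 = one global lock; raise it for bucketing     */
#define DEMO_GRANULE  6u      /* assumption: addresses with equal pa>>6 share a lock */
#define DEMO_MEMWORDS 1024u   /* size of the stand-in memory, in longwords       */

static pthread_mutex_t demo_locks[DEMO_NLOCKS];
static uint32_t        demo_mem[DEMO_MEMWORDS];  /* stand-in for MA780 memory */

void demo_locks_init(void)
{
    for (unsigned i = 0; i < DEMO_NLOCKS; i++)
        pthread_mutex_init(&demo_locks[i], NULL);
}

/* Map a physical address to the lock that covers it. */
static pthread_mutex_t *demo_lock_for(uint32_t pa)
{
    return &demo_locks[(pa >> DEMO_GRANULE) % DEMO_NLOCKS];
}

/* Read-interlocked cycle: take the lock and return the longword.
   The lock stays held until the matching demo_write_unlock(). */
uint32_t demo_read_interlocked(uint32_t pa)
{
    pthread_mutex_lock(demo_lock_for(pa));
    return demo_mem[(pa >> 2) % DEMO_MEMWORDS];
}

/* Write cycle that ends the interlock: store, then release the lock. */
void demo_write_unlock(uint32_t pa, uint32_t value)
{
    demo_mem[(pa >> 2) % DEMO_MEMWORDS] = value;
    pthread_mutex_unlock(demo_lock_for(pa));
}

/* Example use: an interlocked bit set that returns the old bit. */
int demo_bbssi(uint32_t pa, unsigned pos)
{
    uint32_t mask = (uint32_t)1 << (pos & 31);
    uint32_t old  = demo_read_interlocked(pa);

    demo_write_unlock(pa, old | mask);
    return (old & mask) != 0;
}
```

Every demo_read_interlocked() has to be followed by exactly one
demo_write_unlock() for the same address, otherwise the lock is never released.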

>
> 1.2. VMS itself does not appear to use anything other than BBSSI and
> BBCCI in the ASMP code. However applications or libraries using the
> multiprocessing may, so for their sake the same applies to other
> interlocked instructions as well. Those applications or libraries
> might also conceivably use a higher rate of locking (justifying the
> bucketing of locks in this case) -- but do they even exist in the
> first place?

Chapter 4 of the VAX-11/782 User's Guide recommends the interlocked
instructions for user-written code, so they all need to be supported. We
really need the MA780 Technical Description or the Field Maintenance
Print Set to understand how it handles the read-interlocked SBI cycle.

>
> 2. When sending out an IPI, the sending VCPU thread should execute a
> write memory barrier right before writing to the interrupt register.
>
> 3. When receiving an IPI and before handling it, the receiving VCPU
> thread should execute a read memory barrier matching the barrier in
> (2). An obvious implementation would be for (2) and (3) to acquire a
> lock on the "interrupt pending" register of the CPU that is the target
> of the IPI.

I probably should have researched memory barriers a bit more. I knew a
little bit about them but wasn't sure if they were needed here. The
problem may also exist for the rest of the shared memory.
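
For reference, the barrier pairing in points 2 and 3 can be sketched with C11
fences. This is an illustration only: the variable and function names are
hypothetical, and a real implementation would have to apply the same pattern
to whatever the simulator uses as its interrupt register. A release fence
before the flag store plays the role of the write barrier, and an acquire
fence after the flag read plays the role of the read barrier.

```c
#include <stdatomic.h>
#include <stdint.h>

static uint32_t    demo_mailbox;      /* ordinary shared data written before the IPI */
static atomic_uint demo_ipi_pending;  /* hypothetical "interrupt pending" flag       */

/* Sender (point 2): publish the data, then raise the flag. */
void demo_send_ipi(uint32_t payload)
{
    demo_mailbox = payload;                        /* plain store   */
    atomic_thread_fence(memory_order_release);     /* write barrier */
    atomic_store_explicit(&demo_ipi_pending, 1, memory_order_relaxed);
}

/* Receiver (point 3): returns 1 and fills *payload if an IPI was pending,
   otherwise returns 0 and leaves *payload untouched. */
int demo_receive_ipi(uint32_t *payload)
{
    if (atomic_exchange_explicit(&demo_ipi_pending, 0,
                                 memory_order_relaxed) == 0)
        return 0;
    atomic_thread_fence(memory_order_acquire);     /* read barrier  */
    *payload = demo_mailbox;
    return 1;
}
```

The fences also constrain compiler reordering, which covers the compiler
barrier side of the problem.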

>
> As is always with legacy MP code though, it is a bit of a gamble.
> Modern host processors have different cache coherency model than that
> of the 780 CPUs. Thus it is possible for some sequences that worked on
> the 11/78x multiprocessor to start failing when emulated on x86 or
> other contemporary host CPU. Only a detailed review of the code with
> respect to the cache coherency assumed by the code can tell.
>
> But do we even know how the 780 cache operates?
> Is it write-through or lazy writeback?
> Do interlocked instructions (such as BBSSI/BBCCI) invalidate the 780
> read cache?
> Do they commit pending writebacks from the cache to MA780/main memory
> (MS780) before the instruction completion?

Here is an extract from the VAX-11/782 User's Guide that partly answers the
question:

"Each MA780 shared memory subsystem should have the cache invalidation
map option. This option reduces traffic on the Synchronous Backplane
Interconnect (SBI) by reducing the number of cache invalidate requests
sent to each processor. By keeping track of which locations in MA780
memory have been placed in the cache of each processor, the option
allows cache invalidate requests to be sent only to the processor(s)
whose cache contains the location that has been invalidated."

So it looks like the port invalidation control register contains a mask
of the SBI nodes that need to have cache invalidate requests sent to
them. The ASMP code sets the bit for nexus 0 (CPU) as part of the
initialisation.
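
If that reading is right, the effect of the mask can be pictured as below.
This is purely a sketch of the idea: the 16-bit width, the one-bit-per-nexus
layout and all names are assumptions, since the register format is not
documented in this thread.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_NEXUS_COUNT 16u   /* assumption: one mask bit per SBI nexus slot */

/* Hypothetical callback type: deliver one cache-invalidate request. */
typedef void (*demo_inval_fn)(unsigned nexus, uint32_t pa, void *ctx);

/* Send an invalidate for 'pa' to every nexus whose bit is set in 'mask'.
   'send' may be NULL, in which case the requests are only counted.
   Returns the number of nexus slots selected by the mask. */
unsigned demo_send_invalidates(uint16_t mask, uint32_t pa,
                               demo_inval_fn send, void *ctx)
{
    unsigned sent = 0;

    for (unsigned n = 0; n < DEMO_NEXUS_COUNT; n++) {
        if (mask & (1u << n)) {
            if (send != NULL)
                send(n, pa, ctx);
            sent++;
        }
    }
    return sent;
}
```

With only the bit for nexus 0 set, as the ASMP initialisation is described as
doing, a write would generate a single invalidate request in this model.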

Matt

Re: [Simh] New simulator - VAX-11/782

2017-05-21 Thread Matt Burke
On 21/05/2017 15:54, Tim Stark wrote:
> Interesting! Thanks for letting us know about the MP version of the 
> VAX-11/780. Will it also cover 787/789 emulation (VAX-11/785)? How about the 
> 784 (4 processors)? I recommend atomic variables for interlocking. Check a 
> C++ reference for more information. It requires at least C++11 (GCC 4.7 or 
> later). C++11 also provides threads in the standard library. I will use them 
> in my MSE emulator. 
>

Yes, it covers the 787/789 too. Just change the CPU model from 11/780 to
11/785:

sim> set cpu model=785

The VAX-11/784 is also covered, although there seem to be a few issues
here that I need to investigate. Note that VAX/VMS does not support the
VAX-11/784 for multi-processing. You have to write your own code to make
use of the MA780 shared memory.

The sim_ipc module can provide locking for the shared memory. The bit
that needs working out is how to dispatch to ipc_lock for the
read-interlocked memory accesses. I have some ideas about how to do it.

Matt

Re: [Simh] New simulator - VAX-11/782

2017-05-21 Thread Matt Burke

>> On May 20, 2017, at 12:00 PM, Paul Koning  wrote:
>>
>> Interesting.  So you have the two CPUs as two processes?  I wonder if doing 
>> them as threads in a single process might be more straightforward.  I did an 
>> implementation of dual CPU CDC 6000 emulation that way (an extension to Tom 
>> Hunter's DtCyber).  Posix threads (pthread) work nicely, and semaphores 
>> (which aren't strictly part of pthreads but are often found alongside such 
>> implementations) often come in handy as well.
>>
>>  
Yes that's correct. The reason I implemented the VAX-11/782 in this way
is that it more closely replicates the arrangement of the physical
hardware. The real VAX-11/782 is two separate VAX-11/780 systems and the
only connection between them is the MA780 shared memory. For the
VAX-11/782 multi-processing configuration, the secondary VAX-11/780
(attached processor) boots from the shared memory and does not interact
with the console. For the VAX-11/784 each VAX-11/780 system boots from
its own system disk and interacts with the console normally.
Multi-processing is then achieved via user-written code that interacts
via the MA780 shared memory. It would be difficult to simulate the
VAX-11/784 via threads. The other reason for choosing this
implementation is that it requires minimal changes to the existing VAX
simulator code.

Matt

Re: [Simh] New simulator - VAX-11/782

2017-05-21 Thread Paul Koning

> On May 21, 2017, at 10:54 AM, Tim Stark  wrote:
> 
> Interesting! Thanks for letting us know about the MP version of the 
> VAX-11/780. Will it also cover 787/789 emulation (VAX-11/785)? How about the 
> 784 (4 processors)? I recommend atomic variables for interlocking.

Instead of relying on exotic C features, it might make more sense to use 
pthread primitives (mutex etc.) or semaphores.  Those are likely to be more 
generally supported.
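
For comparison, a POSIX semaphore can carry the "interrupt pending" state
directly. A minimal sketch, assuming a host where unnamed semaphores
(sem_init) are available, which is not the case on every platform. All names
are hypothetical.

```c
#include <errno.h>
#include <semaphore.h>

typedef struct {
    sem_t pending;   /* counts IPIs not yet taken by the receiver */
} demo_ipi_sem_t;

/* pshared = 0 means threads of one process. A nonzero value requires the
   structure to live in memory shared between the processes. */
int demo_ipi_sem_init(demo_ipi_sem_t *s, int pshared)
{
    return sem_init(&s->pending, pshared, 0);
}

/* Sender side: post one pending interrupt. Returns 0 on success. */
int demo_ipi_sem_send(demo_ipi_sem_t *s)
{
    return sem_post(&s->pending);
}

/* Receiver side: non-blocking poll. Returns 1 if an IPI was taken,
   0 if none was pending, -1 on any other error. */
int demo_ipi_sem_poll(demo_ipi_sem_t *s)
{
    int rc;

    do {
        rc = sem_trywait(&s->pending);
    } while (rc != 0 && errno == EINTR);   /* retry if interrupted by a signal */

    if (rc == 0)
        return 1;
    return (errno == EAGAIN) ? 0 : -1;
}
```

POSIX lists the semaphore calls among the functions that synchronize memory,
so a successful poll also sees the stores the sender made before posting.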

paul


Re: [Simh] New simulator - VAX-11/782

2017-05-21 Thread Tim Stark
Interesting! Thanks for letting us know about the MP version of the VAX-11/780. 
Will it also cover 787/789 emulation (VAX-11/785)? How about the 784 (4 
processors)? I recommend atomic variables for interlocking. Check a C++ 
reference for more information. It requires at least C++11 (GCC 4.7 or later). 
C++11 also provides threads in the standard library. I will use them in my MSE 
emulator. 

Yeah. I am aware of the DtCyber emulator, but it is useless due to the lack of 
available NOS software. I was looking for NOS on the Internet but can't find it. 

I have a Cray-1x emulator with COS 1.17 software (the latest version). I heard 
that the COS 1.13 sources were released into the public domain, but I can't find 
them yet. I am still looking for UNICOS software (UNIX for Cray).

The Cray-1x emulator sources were available on code.google.com, but Google 
discontinued that service. Every repo on that site went empty. I did make a 
copy of the SVN dump before it was gone.

Tim 


> On May 20, 2017, at 12:00 PM, Paul Koning  wrote:
> 
> 
>> On May 19, 2017, at 6:15 PM, Matt Burke  wrote:
>> 
>> Ok, so now it's time to reveal the new 'device' I've been working on
>> that I mentioned a few days ago. It is in fact the MA780 multi-port
>> memory as used by the VAX-11/782. Simulator source is now on Github and
>> binaries (for Win32) are available on my website along with a guide on
>> how to setup VAX/VMS for asymmetric multi-processing:
>> 
>> http://www.9track.net/simh/vax782/
>> 
>> This simulator relies on a new module 'sim_ipc' which has O/S specific
>> code for inter-process communication. This module should work on at
>> least Windows, VMS and POSIX compliant UNIX variants but no doubt some
>> more work will be required in this area to make it truly portable. There
>> are a few known issues as noted on my website but please try this out
>> and let me know how you get on.
> 
> Interesting.  So you have the two CPUs as two processes?  I wonder if doing 
> them as threads in a single process might be more straightforward.  I did an 
> implementation of dual CPU CDC 6000 emulation that way (an extension to Tom 
> Hunter's DtCyber).  Posix threads (pthread) work nicely, and semaphores 
> (which aren't strictly part of pthreads but are often found alongside such 
> implementations) often come in handy as well.
> 
>paul


Re: [Simh] New simulator - VAX-11/782

2017-05-20 Thread Paul Koning

> On May 19, 2017, at 6:15 PM, Matt Burke  wrote:
> 
> Ok, so now it's time to reveal the new 'device' I've been working on
> that I mentioned a few days ago. It is in fact the MA780 multi-port
> memory as used by the VAX-11/782. Simulator source is now on Github and
> binaries (for Win32) are available on my website along with a guide on
> how to setup VAX/VMS for asymmetric multi-processing:
> 
> http://www.9track.net/simh/vax782/
> 
> This simulator relies on a new module 'sim_ipc' which has O/S specific
> code for inter-process communication. This module should work on at
> least Windows, VMS and POSIX compliant UNIX variants but no doubt some
> more work will be required in this area to make it truly portable. There
> are a few known issues as noted on my website but please try this out
> and let me know how you get on.

Interesting.  So you have the two CPUs as two processes?  I wonder if doing 
them as threads in a single process might be more straightforward.  I did an 
implementation of dual CPU CDC 6000 emulation that way (an extension to Tom 
Hunter's DtCyber).  Posix threads (pthread) work nicely, and semaphores (which 
aren't strictly part of pthreads but are often found alongside such 
implementations) often come in handy as well.

paul


[Simh] New simulator - VAX-11/782

2017-05-19 Thread Matt Burke
Ok, so now it's time to reveal the new 'device' I've been working on
that I mentioned a few days ago. It is in fact the MA780 multi-port
memory as used by the VAX-11/782. Simulator source is now on GitHub and
binaries (for Win32) are available on my website along with a guide on
how to set up VAX/VMS for asymmetric multi-processing:

http://www.9track.net/simh/vax782/

This simulator relies on a new module 'sim_ipc' which has O/S specific
code for inter-process communication. This module should work on at
least Windows, VMS and POSIX compliant UNIX variants but no doubt some
more work will be required in this area to make it truly portable. There
are a few known issues as noted on my website but please try this out
and let me know how you get on.

I suspect the simulation is not totally accurate as there was very
little documentation available to base this on. Does anyone have a copy
of the following manuals?

EK-MA780-TD - MA780 Technical Description
MP-00826-00 - MA780 Field Maintenance Print Set

Matt