[ilugd]: A suggestion for common architecture for development & deployment platform

Rajagopal Iyer Fri, 06 Dec 2002 07:40:47 -0800
hi all,

I am Rajagopal S. Iyer based in Mumbai.

I am heading the IT department of a leading construction company with a LAN of around 100 nodes and seven servers, and about a dozen Clipper-based legacy applications.

I have a total of 18 years of experience in S/W, H/W and networks, ranging from the old Z80 to minis and mainframes.

I am attaching a file in which I have outlined a common architecture for an open source application development and deployment platform. It is the outcome of a study I conducted.

It is proposed as a master blueprint which addresses various areas of computing.

Please note that many of the technologies are already available. It is just a matter of combined effort to put them together.

I welcome any and every question, doubt, improvement and constructive, validated criticism on this text.

I know it contains some theoretical portions; some are yet to be verified mathematically.

I suggest that people read the whole document once and then the relevant sections that interest them.

I request all to look into the concepts presented rather than the names given to them.

Thank You and Regards,

Rajagopal S. Iyer
AUM
namo namaH
shrii gurubhyo namaH
hariH AUM


yajur aNOTHER jOINT uNDERTAKING rESULT - veda eXTENSIBLE dISTRIBUTED
aRCHITECTURE


Proposed Computing Environment
By Rajagopal S. Iyer, Thane, Mumbai. Phone: (9122) 547 3884
(Sr. Co-ordinator, Facilities, ISD, Lok Group)
e-mail :  [EMAIL PROTECTED]

Copyright (C) 16th February 2002, Rajagopal S. Iyer. All rights Reserved by
the author.

Organisations and the people who run them don't require computers:
       They require their familiar, private,
        powerful, secure computing environment 
        anywhere, anytime.
 
This architecture keeps the human at the centre of the design. Care has
been taken to ensure that most of the ideas mentioned here can be
implemented with off-the-shelf technology, with performance that the user
will find acceptable in real time.

A vertically integrated machine with this architecture does not yet exist,
but one can be built by modifying various underlying technologies.
 
vEDA eXTENSIBLE dISTRIBUTED aRCHITECTURE (veda) is proposed for very high
availability and location freedom and absolute privacy.

The proposed architecture
 
 A Unified Machine (AUM) architecture is proposed

Overall architecture:
 
 The layered architecture is described below:
 
The selection is based on ready availability of necessary hardware &
Software. Many of the proposed channels are already functional and
available under GNU GPL or other open source licence.

Storage Pyramid: (all serving as network data reservoir)
  Registers (0)
  Cache (1)
  RAM (2) - Storage for programs
  RAM (3) - as a Cache for devices below
  HDD (4)
  Optical R/W & R/O disks(5)
  Magnetic tape (7)
 
 CPU/Kernel Pyramid (Only two states: Running and housekeeping)
 
 It is a 128-bit word architecture, of which 120 bits are for data and
 8 bits for ECC.

Error Checking and correction algorithm is based on a simple binary tree
representation and a small recursive function which terminates when two
immediate neighbours are found.
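
The 120 + 8 split of the word can be sketched as below. This is only a minimal illustration: placing the ECC bits in the low 8 bits is an assumption, and the ECC value is treated as an opaque number, since the binary tree algorithm itself is not specified here. The names pack_word and unpack_word are hypothetical.

```python
DATA_BITS = 120                      # data portion of the word
ECC_BITS = 8                         # error checking and correction portion
DATA_MASK = (1 << DATA_BITS) - 1
ECC_MASK = (1 << ECC_BITS) - 1


def pack_word(data, ecc):
    """Build one 128-bit word: data in the high 120 bits, ECC in the low 8.

    The position of the ECC field is an assumption made for this sketch.
    """
    if not 0 <= data <= DATA_MASK:
        raise ValueError("data does not fit in 120 bits")
    if not 0 <= ecc <= ECC_MASK:
        raise ValueError("ecc does not fit in 8 bits")
    return (data << ECC_BITS) | ecc


def unpack_word(word):
    """Split a 128-bit word back into (data, ecc)."""
    return word >> ECC_BITS, word & ECC_MASK
```

A word built with pack_word(5, 3) unpacks again to (5, 3), and a word with every data and ECC bit set occupies exactly 128 bits.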

Running implies that the CPU continuously switches between four states
preemptively in real time.

Run Level 0: (Kernel) (Ring 0)
 
   IPC Media : registers
 
   Line frequency synchroniser
   Preemptive Task Switcher
   Memory Manager
   Device Drivers
   Heartbeat Generator/ Responder
   Network communicator local/remote
   real time calendar (cosmological)
 
 Run Level 1: (Two Tasks only)
 
   IPC Media : Cache Memory
 
Each task will listen to and process exactly one input channel and give
output to exactly two channels. Those two channels could be any.

   Virtual Machine 0 & 1 (Takes one input and gives two outputs - Local & network)

     Input Threads: (two only)
        Check & Listen to Local Input Device (core dump errors)
        Check & Listen to Network Input Devices (core dump errors)
 
     Process Threads (heartbeat core dumps to local storage)
        Take data from Input device (core,log dump after Checking)
        Prepare transfer of data (Presentation & core,log dump)
        Output Data to output devices (after Checking Tee to n/w and local  storage)
 
     Output threads
        Check & Talk to Local Output Devices (core dump errors)
        Check & Talk to Network Output Devices (core dump errors)
 
 Run level 2 (Two Tasks Only - Graphics processing)
 
    IPC Media : RAM
 
    Typical tasks (high bandwidth requirement)
 
    Input       = Images / Sound
    Process     = Image/High bandwidth signals
    Output task = Display Update
 
 
 Run Level 3 (Two tasks only - Structured Text / Voice processing)
 
    IPC Media : RAM
 
    Typical Process (Voice processing, Structured Text processing)
 
    Input       = Sound Device
    Processing  = Uses ITRANS/ other algorithms
    Output      = Sound Device
 
 Run level 4 (User Interface)
 
 Typical Applications: User authentication
                      Configuration Selection
 
    IPC Media   : HDD
 
    Input       = Composite
    Processing  = Transaction
    Output      = Composite
 
 Run level 5 (long term processing - User Applications)
 
    IPC Media : Replicated file system
 
    Input   : Messages from user in composite media
    Process : Composite processing (using all lower run level facilities)
    Output  : Messages to User in composite media
 
 
Primitive Data types :
 
Numbers:

 only whole numbers (unsigned 128 bit integer)
 
   664613997892457936451903530140172288 = 2^119
  1329227995784915872903807060280344576 = 2^120 (Max; the largest 120-bit pattern is 2^120 - 1)
 
 no floating point representations
 no negative numbers representation in bare machine
 Zero represented by all zero bits
 Only closed infinity representation (all 1s) (affine or closure?)
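
The number rules above can be tried out with a small sketch. It assumes that the numbers live in the 120 data bits, and that arithmetic saturates at the all-ones infinity value. The saturating behaviour is my assumption, as the text only says how infinity is represented. The function names add and sub are hypothetical.

```python
DATA_BITS = 120
ZERO = 0                             # zero is all zero bits
INFINITY = (1 << DATA_BITS) - 1      # infinity is all one bits


def add(a, b):
    """Unsigned addition; saturating at INFINITY is an assumption."""
    if a == INFINITY or b == INFINITY:
        return INFINITY
    return min(a + b, INFINITY)


def sub(a, b):
    """Unsigned subtraction; the bare machine has no negative numbers."""
    if a == INFINITY:
        return INFINITY
    if b > a:
        raise ValueError("negative results are not representable")
    return a - b
```

With these rules add(1, 2) gives 3, while adding anything to a value close to the limit simply yields INFINITY instead of wrapping around.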
 
Text: 
  Plain ASCII

Graphic:

For the sake of simplicity in this architecture outline, a ring of the
primary colours Red, Green and Blue plus White is assumed as four points.
The ratio is somewhat similar to the television-standard colour ratio for
white, twisted a bit to suit our prime requirement of simplicity and
practicality.

Only Two colour spaces, RGBW and CMYB, are considered.

Red, Green, Blue & White for display devices
Cyan, Magenta, Yellow, Black for Print Devices

The reason for adding white to RGB is to allow for a white temperature of 6500 K.

As the largest 

primitive definition (square limits are assumed, as that is computationally
easier on my brain):

   Raster: Each pixel has 120 bits allocated to it across all its colours.
                 Bit allocation for the RGBW system is:
                   12 for Red
                   75 for Green 
                   13 for Blue 
                   20 for White
                    8 bits for ECC.
           (To be used as Data in Device Space)
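
The raster bit allocation can be checked with a short sketch. The widths are the ones listed above (12 + 75 + 13 + 20 = 120 data bits); the order of the fields inside the word is an assumption, and the function names are hypothetical. The 8 ECC bits are left out here.

```python
# (name, width in bits) as listed in the text; the field order inside
# the 120 data bits is an assumption made for this sketch.
FIELDS = (("red", 12), ("green", 75), ("blue", 13), ("white", 20))


def pack_pixel(red, green, blue, white):
    """Pack the four colour components into one 120-bit value."""
    values = {"red": red, "green": green, "blue": blue, "white": white}
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_pixel(word):
    """Recover the four components from a 120-bit value."""
    out = {}
    for name, width in reversed(FIELDS):   # last packed field is lowest
        out[name] = word & ((1 << width) - 1)
        word >>= width
    return out
```

Packing a pixel and unpacking it again returns the same four values, and a pixel with every component at its maximum fills all 120 data bits.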
                    
   Vector: 57 bits : 19 per dimension (x,y,z) start point (524288 mm in device space)
           57 bits : 19 for relative co-ordinates (x,y,z) from start point
                     (all zero here can be treated as a point)
           6 bits : RGB colour for the starting & ending points.
           8 bits for ECC

           Black or non-existence is all zero, which is anyway not needed.

           All 1s will indicate a white line from begin to end point (ECC excluded).

           White or Black component not needed in pure colour space as the
           temperature, as perceived, could be essentially noise caused
           by superimposition of the various waves from the invisible
           electromagnetic spectrum.

           For the sake of clarity, the data for coordinates may be
           represented in pure 7-bit ASCII at a higher level, as each point
           requires at most 22 bytes: (7 x 3) for co-ordinates + 1 for colour.

           (known light speed 300,000,000,000 mm/s)
           (To be used as Address in the device space)

( Comment on Practical Dimensions:
524 meters in X Y and Z dimensions
   > 1/2 KM media?  --
   Devices are still not available! )
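
The vector primitive can be sketched in the same way. The sketch assumes that the 6 colour bits are split as 3 bits (one each for R, G and B) per end point, which makes the total 57 + 57 + 6 + 8 = 128 bits; the field order and the function names are also assumptions.

```python
COORD_BITS = 19
COORD_LIMIT = 1 << COORD_BITS        # 524288 device units (mm) per axis
RGB_BITS = 3                         # assumed: one bit each for R, G, B


def pack_vector(start, delta, start_rgb, end_rgb):
    """Pack start (x,y,z), relative (x,y,z) and two colours into 120 bits."""
    word = 0
    for coord in (*start, *delta):
        if not 0 <= coord < COORD_LIMIT:
            raise ValueError("co-ordinate does not fit in 19 bits")
        word = (word << COORD_BITS) | coord
    for rgb in (start_rgb, end_rgb):
        if not 0 <= rgb < (1 << RGB_BITS):
            raise ValueError("colour does not fit in 3 bits")
        word = (word << RGB_BITS) | rgb
    return word


def is_point(word):
    """True when all three relative co-ordinates are zero."""
    delta = (word >> (2 * RGB_BITS)) & ((1 << (3 * COORD_BITS)) - 1)
    return delta == 0
```

A vector whose relative co-ordinates are all zero is reported as a point, and 2^19 device units of one millimetre each give the 524 metre limit mentioned in the comment above.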

Graphic Primitive Processing:

Processing is by reading the co-ordinate and doing a binary recursion
as specified above for either vectorising or rasterising depending on the
device.

Start at device Origin, Recurse along the X axis, at the end of device
limit, switch axis to Y and then in Z axis, Return to Origin.

Graphic Computational speed & predictability: As the binary recursion
algorithm has a known termination condition under all circumstances, it
can safely be said that, for the worst case of rendering two adjacent
pixels P0(0,0,0 - R1G1B1) and P1(0,0,1 - R2G2B2), only the end points need
to be taken, and the maximum will be 6 * 13 = 78 iterations.

In the proposed architecture it will take 78 clock cycles to render one
graphic primitive.

Worst case will be a one-bit change in each successive pixel.

Proofs of the efficiency of the algorithms are yet to be worked out
mathematically. O(n) is to be obtained mathematically.

{-- Proof? who? me? No, Sorry! I am mathematically challenged :-) }

Network Layers (OSI based - Total 7)
 
 Physical (7 total):
 (Criteria: Speedwise and vicinity wise decreasing order)
   CPU-Internal-Registers (CPU/memory Bus(0)),
   CPU-RAM / Display path (1),
   CPU-System Bus (Sound card video capture) (2),
   Disk bus : SCSI, fibre channel (3)
   Ethernet (4),
   IR/Wireless (5),
   Serial Port (6) (allows use of local phone lines for PPP)
 
 Network:
 Ring 0 = Register
 Ring 1 = Cache
 Ring 2 = RAM
 Ring 3 = RAM
 Ring 4 = HDD
 
 Transport:
 
 Presentation
 Translation????
 Application
 
 Scheduling algorithm.
 
 The machine cycles through the following states:
 
 0. Running
 1. Preparatory
 2. Wait
 3. Review
 4. Reorder Priority
 5. Cycle complete
 
 
The recommended time slice ratio for these is 20, 75, 12, 13 clock cycles
respectively (total 120).
 
 0. Running
 
   In this state the process is actually run
 
 1. Preparatory
 
  In this state the necessary resource allocations for running the process are prepared
 
  1. Read the Process Table
  2. allocate necessary resources
  3. signal to run the process
 
 2. Wait State
 
 This is the stage in which the machine executes the heartbeat function, which consists of

  1. read any error status from the heartbeat messages of other nodes
  2. prepare the status report
  3. write error / log messages in the respective locations
 
 3. Review
 
    In this state, the messages produced by running the process are
    reviewed, and the necessary steps for transmitting them to the next
    process are taken

 4. Reorder states
 In this stage, the state orders of the two processes are swapped
     (if the state order of process 0 is 0123, it changes to 1032,
          and if that of process 1 is 1032, it changes to 0123)
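
One way to read the reorder step is as a swap inside each pair of states (0 with 1, 2 with 3). A minimal sketch under that assumption follows; the function name reorder is hypothetical.

```python
def reorder(states):
    """Swap 0<->1 and 2<->3 by flipping the lowest bit of each state number.

    This pairwise swap is one reading of the 0123 <-> 1032 example.
    """
    return [state ^ 1 for state in states]
```

Applied to 0123 this gives 1032, and applied to 1032 it gives 0123 again, so doing the reorder twice returns each process to its original order.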
 
 5. Cycle Complete state
 
In this state the necessary logging of machine state and heartbeat, and
cycle number detail stamping, are to be taken care of (Fill details here)
 

 6. State Increment State
  This is the crucial state where the promotion of state is done.
 
 7. respawn state
   process with incremented states will launch itself ??? (Fill in properly)
 
 In between these states is the idle state.
 
 The suggested scheduling sequence (in real time) is:
 0-->1-->1-->0-->0-->1-->2-->2-->1-->0-->0-->1-->2;
 1-->2-->2-->1-->1-->2-->3-->3-->2-->1-->1-->2-->3. (VM0)
  |
  v
 4
  |
  v
 0-->1-->1-->0-->0-->1-->2-->2-->1-->0-->0-->1-->2;
 1-->2-->2-->1-->1-->2-->3-->3-->2-->1-->1-->2-->3. (VM1)
  |
  v
 5
  |
  v
 6
  |
  v
 7
 
(This is binary recursion with the termination condition that both the
neighbours of the tree are found.)

 Directed graph representation of states
 
        (Root)
          (0)
         /   \
       (1)   (2)
               \
               (3)
 
If the inversion of the 0 & 1 states is not done, the machine will run
uncontrollably with the binary recursion algorithm that is used for
task scheduling.
 
The head-inversion causes two breaks. The task is deemed completed only
after machine enters the Idle State.

In fact this data structure and processing can be replicated in the upper
rings which will result in predictable performance.

Performance factors are to be worked out mathematically.
 
 Implementation, & Availability Issues
 -------------------------------------------------

The two Ring-1 tasks are two identical clustered virtual machines using
MOSIX or any other similar kernel patches.
 
The clustering of one box with another is through SCSI or the system bus
(so there can be a single-box high-performance cluster).

Fibre channel / 100 Mbps network media is the second option.
 
The machines are identical to each other in function. They communicate
using registers as network media.
 
The proposed machine will have a replicated, journalled file system.
 
Where two disks are available, RAID Technology should be used.
 
There is only one login and two directories for each person.
 
The network is Purely a private network (IP address Space) with network
path to Internet wherever necessary.

It is envisaged that, by carefully planning each node's IP address and
those of its neighbours, we will never run out of private IP addresses.
The only data flowing in this network (if all participating machines
conform to this architecture) will be 7-bit ASCII text.
 
The most secure network protocol, with PGP signatures, is suggested for
the base configuration.

User authentication / configuration management is to be managed with
LDAP. This includes machine-specific optimised utilities for the highest
degree of interoperability.
 
All the network listen channels should act as an input device (stdin).
 
All the network talk channels should act as an output device (stdout).

All the network error channels should act as an error device (stderr).

 Hardware:
 
 The machine should be able to use the following as the network media:
 Register, Cache, System Bus, SCSI, NIC, Serial port, Parallel port,
 Sound (can be picked up thru System Bus), Optical & Wireless
 
 The machine should be able to draw energy out of following sources:
 Electrical, optical, Solar, chemical, Sound (if possible)
 
 The network and energy source path can be the same.
 
Each machine will have a compute node, a storage node and varied
network paths as described above.
 
Each node participating in this computer will have exactly one path for
talk and listen. It should be able to talk/listen through any external
network port to any other machine.
 
All the network nodes will have fixed addresses for each network port.
 
The base machine itself will have One IP address
 
Networking protocols supported will be TCP/IP. Optionally IPX/SPX and NCP
may be supported.

Network Environment

As the machine is always in listen mode for messages and talks only when
required, the network bandwidth requirements are based on three parameters
associated with the message processing:

 1. Path
 2. Quanta
 3. Frequency

The aim is to select path in such a way that the product of quanta and
frequency, (which gives the total message size) is transmitted through the
network physical layer within user-acceptable time frame.
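
The path selection rule can be sketched as below. It assumes that quanta is the size of one message unit in bits and frequency is the number of units to be sent, so that their product is the total message size. The path names and bandwidth figures in the usage note are only illustrative, and the function names are hypothetical.

```python
def transfer_seconds(quanta_bits, frequency, bandwidth_bps):
    """Time to send quanta * frequency bits over a path of given bandwidth."""
    return quanta_bits * frequency / bandwidth_bps


def pick_path(paths, quanta_bits, frequency, limit_seconds):
    """Return the first path, in the given order of preference, that can
    carry the whole message within the user-acceptable time, else None.

    `paths` is a list of (name, bandwidth in bits per second) pairs.
    """
    for name, bandwidth_bps in paths:
        if transfer_seconds(quanta_bits, frequency, bandwidth_bps) <= limit_seconds:
            return name
    return None
```

For example, with paths = [("serial", 115200), ("ethernet", 100000000)] and a one second limit, a message of 10 units of 8000 bits fits on the serial path, while 100 such units need the ethernet path.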

The most common information formats are (in decreasing order of bandwidth
requirement):

1. Video
2. Sound
3. Structured Text
4. 

Considering the common applications, a correspondence table can be made as
follows:



Network environment supports 






System environment
 
GNU/Linux is suggested for its immense scalability and flexibility and,
most importantly, its configurability and its case-sensitive file names,
as case sensitivity is at the heart of the ITRANS encoding scheme for
system filenames.

All the networking code should be optimised into the kernel.
 
This machine will have, at the primitive level, instructions which will
emulate the processing described above.
 
The suggested base software is EMACS in its various incarnations, as the
self-recursive nature of self-insert-command is what makes it the most
extensible software.
 
 CVS at a lower level will ensure automatic journalling of all changes:
 textual and binary.

It is proposed that this machine will use an ITRANS envelope for
self-insert-command. This will ensure that the machine will be able to
translate voice into ITRANS encoding. As Sanskrit is a phonetic language,
and ITRANS a viable and very practical representation of phonetics, a user
will just have to record the basic letters of Sanskrit, which will be
stored for voice reproduction. The messages can then be displayed, and the
user's interpretation of each message recorded and stored as a parameter
"mother tongue" of the user.
 
 This too can be embedded in kernel with a little bit of effort.
 
ITRANS processing should also be part of kernel. This will reduce the
bandwidth requirement for data as only encoded text is required to produce
speech. (ITRANS processing does not require backstepping). The phonetic
ITRANS atom is max. 3 characters.
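
Splitting ITRANS text into atoms of at most 3 characters, without going back over text already consumed, can be sketched as a longest-match scan. The atom table below is a small illustrative subset chosen for this sketch, not the complete ITRANS scheme, and the function name tokenize is hypothetical.

```python
# A small illustrative subset of ITRANS atoms (not the full scheme).
ATOMS = {
    "a", "aa", "i", "ii", "u", "uu", "e", "ai", "o", "au",
    "k", "kh", "g", "gh", "ch", "chh", "j", "t", "th", "d", "dh", "n",
    "p", "ph", "b", "bh", "m", "y", "r", "l", "v", "sh", "s", "h",
    "H", "M",
}
MAX_ATOM = 3   # the longest atom is three characters


def tokenize(text):
    """Split text into atoms, always taking the longest match first."""
    tokens = []
    pos = 0
    while pos < len(text):
        for length in range(min(MAX_ATOM, len(text) - pos), 0, -1):
            candidate = text[pos:pos + length]
            if candidate in ATOMS:
                tokens.append(candidate)
                pos += length
                break
        else:
            raise ValueError(f"no known atom at position {pos}")
    return tokens
```

With this table tokenize("shrii") gives sh, r, ii and tokenize("namaH") gives n, a, m, a, H. The scan only ever looks at the next three characters, which is why so little state is needed.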
 
The cosmological time recording system should be the base time machine for
all time processing. Open source libraries are already available for this
(solar, lunar and planetary positions, phase of the moon, galaxy
coordinates etc.).

Conversion routines for the various current time systems should be there.
 
The name space (LDAP or PostgreSQL Implementation)

Transaction Engine

A Unit transaction is defined as recording of the changes related to an
entity caused by an internal or external event reliably.

The transaction model is essentially an event-driven model.

A single table - Event log - is to be maintained which captures all the
external and internal events

Event Master table will enumerate all the possible events at the given
level of operation.

Name Master will enumerate all the possible names that can be encountered
in the system. Each name will have backward and forward pointers. Backward
pointers for Source of Name, Forward pointers for transformed entities. It
will also have an alias property referring to its alias in the same table.
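
A minimal sketch of the Name Master, assuming it is held as a dictionary keyed by name, is given below. The class and method names, and the construction material names in the usage note, are hypothetical and only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NameEntry:
    name: str
    sources: List[str] = field(default_factory=list)         # backward pointers
    transformed_to: List[str] = field(default_factory=list)  # forward pointers
    alias: Optional[str] = None    # refers to another name in the same table


class NameMaster:
    def __init__(self):
        self.entries = {}

    def add(self, name, alias=None):
        """Register a name; an alias must already exist in this table."""
        if alias is not None and alias not in self.entries:
            raise KeyError(f"unknown alias {alias!r}")
        self.entries[name] = NameEntry(name, alias=alias)

    def link(self, source, target):
        """Record that `source` is transformed into `target`."""
        self.entries[source].transformed_to.append(target)
        self.entries[target].sources.append(source)
```

For instance, after adding "cement" and "concrete" and calling link("cement", "concrete"), the entry for "concrete" points back to "cement" as its source and the entry for "cement" points forward to "concrete".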

Data model for transactions

It essentially consists of two self referential entities.

This model lends itself to the self-learning feature of EMACS, which is
intended as the primary front end.

In a parametrised form, the data model can be viewed as given below

(In Hierarchical Model -- Inverted Tree)

An entity can be one of the three types:
     Human or System or Physical

Corresponding Human Entities are:
     Person / Organisation Unit / Organisation

System Entities are (all are goals -- Desire):
     Functional /  Organisational Unit  / Organisation 

Functional Entities are:
     Person / Organisational Role / Processes

Physical entity can be  Person or Hardware or Message

An entity is located at: 
      Physical or Document or Network

A message can be
    Requirement / Status / Event

Messages in the Physical forms
    verbal / Internal / External (Images/Document)


Requirement will arise through expression and when recognised, will go
through following four states in the order specified on each event type as
shown below:

           (C)       (M)       (T)  
    Desire ---> Want ---> Need ---> Necessity

Status state diagram will be the following

                   (C)          (M)         (T)
    Initialisation ---> Running ---> Result ---> Logging

Event State Diagram will be:

    Creation ---> Modification ---> Transformation ---> Wait
       ^                                                 |
       |                                                 |
       +-------------------------------------------------+
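
The three diagrams above can be written down as simple state chains. The sketch assumes that the labels (C), (M) and (T) on the arrows stand for the Creation, Modification and Transformation events; that reading, and the function names, are assumptions.

```python
REQUIREMENT = ("Desire", "Want", "Need", "Necessity")
STATUS = ("Initialisation", "Running", "Result", "Logging")
EVENT_CYCLE = ("Creation", "Modification", "Transformation", "Wait")

# Event labels on the three arrows of a chain, in order (assumed meaning:
# C = Creation, M = Modification, T = Transformation).
ARROW_LABELS = ("C", "M", "T")


def advance(chain, state, event):
    """Move one step along a chain if `event` labels the next arrow;
    otherwise stay in the same state."""
    index = chain.index(state)
    if index < len(ARROW_LABELS) and ARROW_LABELS[index] == event:
        return chain[index + 1]
    return state


def next_event_state(state):
    """Step round the event cycle; Wait leads back to Creation."""
    index = EVENT_CYCLE.index(state)
    return EVENT_CYCLE[(index + 1) % len(EVENT_CYCLE)]
```

So a Desire that receives a (C) event becomes a Want, an event with the wrong label leaves the state unchanged, and the event cycle returns from Wait to Creation.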


A matrix of the above could be formed to obtain master table and
transaction tables.

It should be an orthogonal matrix to arrive at precise relations.

At the lowest level, each Name has three properties:

1. Space Value
2. Money Value
3. Time Value

Event Model

A message triggers one or more of the following three events, or a
combination thereof:

1. Creation
2. Modification
3. Transformation (into another name)

The events themselves are to be logged separately in the audit trail table
or event log table.

These events cause insertion into the Transaction table with a time/location stamp.

Any modification of any entity will also cause a Transaction insertion.

Transformation causes the change of name and an entry in the transaction
table. After transformation the original entity ceases to exist for further
transactions.
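
The event model can be sketched as a small in-memory engine. This is only an illustration under assumptions: plain Python lists stand in for the event log and Transaction tables, the system clock supplies the time stamp, and all class, method and field names are hypothetical.

```python
from datetime import datetime, timezone


class TransactionEngine:
    def __init__(self):
        self.event_log = []       # audit trail of every event
        self.transactions = []    # time/location stamped transaction rows
        self.live = set()         # names still open to further transactions

    def _record(self, event, name, location, detail=None):
        stamp = datetime.now(timezone.utc)
        self.event_log.append((stamp, event, name))
        self.transactions.append({"time": stamp, "location": location,
                                  "event": event, "name": name,
                                  "detail": detail})

    def _require_live(self, name):
        if name not in self.live:
            raise KeyError(f"{name!r} does not exist for transactions")

    def create(self, name, location):
        if name in self.live:
            raise ValueError(f"{name!r} already exists")
        self.live.add(name)
        self._record("creation", name, location)

    def modify(self, name, location, detail):
        self._require_live(name)
        self._record("modification", name, location, detail)

    def transform(self, name, new_name, location):
        """Rename the entity; the original name ceases to exist."""
        self._require_live(name)
        self.live.remove(name)
        self.live.add(new_name)
        self._record("transformation", name, location, detail=new_name)
```

After an entity "order" is created, modified and then transformed into "invoice", the Transaction table holds three stamped rows, and any further attempt to modify "order" is rejected.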

Reports:

Any kind of Report can be generated using a universal cross tabular
reporting function from a normalised table of transaction. EMACS can be
tweaked to be a report writer after EMACS isearch is modified with suitable
atoms. (minimal effort..Lazy way out)

In fact all the data may be stored in pure text as EMACS is Extremely good
at handling pure text.


Development tools:

A suitable development tool for defining the data model can be designed
for easy access to data

+++++++++++++++++++++++  REVIEW THOROUGHLY the following +++++++++++++++
(insert, update and delete thoughts only here)

All the entities will have the following common attributes

 1. Name
 2. Location
 3. Location timestamp
 4. Type (Person)
 5. Value Timestamp
 6. Value (in Number)
 7. Creation timestamp
 8. Last Access begin timestamp
 9. Last Access end timestamp
10. Last Modify begin timestamp
11. Last Modify end timestamp
12. Transformed to (target entity)
13. Transformation timestamp


In the above scheme every entity has three phases:

1. Comes into existence at a given location at a given point of time with a
   given value at that point of time.

2. Its Value, location and other attributes may get modified over time
   which are recorded in the database

3. It finally gets transformed into another Name (after this go to step 1)
  
A message is recorded along with every event that happens.


A database of the following is to be available always:

  1. Name of the entity
  2. Value of the entity
  3. Space of the entity
  (Add remaining database details here)
 
 A transaction engine which emulates the kernel in processing transaction.


Transaction Engine will track Name, Message, Space, Value, state
simultaneously and time / location stamp them.

Each Transaction has four states: Initiation, Running, Sustenance and Perpetuality

Each Message/Value transfer has four states: Dire Needs, Actual, Key (for
what???), Maintenance (of What?-- specify).

 Every transaction has four states at which money/material exchange occurs:
    1. Initiation
    2. Actual
    3. Result
    4. Maintenance
 
 (Put a good deal of explanation text in here)
 
 (

a good beginning would be a template in C++ taking a token of any data type
as the argument, allocating memory and returning a pointer to that token.
well I can't think further .... :-) 
  
   // Allocate a heap copy of the token; the caller must delete it.
   template <typename T>
   T* CreateData(const T& a) {
     return new T(a);
   }

...... or whatever... hope u get the point rather than syntax

)

+++++++++++++++++++++++  END REVIEW THOROUGHLY ++++++++++++++++++++++++

Hard copy printing Environment.

It is envisaged that Four primary drivers must be made available:

   0. Pure Text
   1. Epson 9 and 24 Pin emulation
   2. HP PCL Emulation
   3. PostScript
   
Language support Environment

It is necessary that a minimum of the following character sets

   0. Plain ASCII character Set (7-bit)
   1. Unicode
   2. Devanagari
      (San98 TTF or Xdvng with suitably modified glyphs for ITRANS encoding)

With glyphs in
   0. GUI Mode
   1. Character mode

in either hardware / software form is made available



 User environment
 ----------------
 
 Each user will have exactly one login and one root directory.
 
 The owner of the directory decides the physical location of data
 
The system will have on-demand appearance of files on invocation of a
command (either timed or user decided)
 
It is recommended that the user have two top level storage domains:
    Work (Organisation controlled access)
    Personal or Pleasure (User Controlled access)

  All the rest can be organised below these two domains with relevant symlinks
 
Application visibility is based on the User profile in the authentication
System.

This authentication System can then be suitably integrated with
certification authority for B2B and other Internet based secure application
requirements. A private port can be leased.

The machine fingerprint is stored in LDAP. Based on the fingerprint, the
most suitable image from any of the nearby machines will be loaded.

Training Issues

 Primary user training will be on Touch Typing (gtypist) and EMACS in character mode.

 Only incremental training will then be required

(Except in exceptional cases, a pointer device may not be required for most
of the usual user activities under this architecture, as the mouse is
deemed a time waster / disruptive technology. Hence GUI training takes a
backseat.)
 
 Application training will be specific depending upon user profile.
 
The user can use Character or Graphical Interface depending upon the
available hardware at that point of time and machine
 
 The typical applications the Base users will need are:

    Document Processor (LaTeX)
    Text Processor (EMACS)
    Spreadsheet (SC or any better as required)
    Internet Browser (W3)
    E-mail (rmail of Emacs/mutt/pine)
 
    Printing to any printer from above application

 All the above applications will be in character mode
 
 For slightly advanced users:
    All of the above in GUI and Character Mode.
    Bitmap Image Processor
    Vector Image Processor
    Presentation package
    Sound Recorder / Editor / Listener
    Video Recorder / Player
 
 
 For Developers:
    All the above and CVS Access
    All the necessary development tools
 
 
 For system end:
     LDAP (Authentication)
     PostgreSQL or SAPdb (or similar RDBMS) for Data Storage (optional)
     A Transaction Engine
     A Time Engine
     A Logger Engine
     A Location Engine (with Lat/long/height & distance calculator)