Raul, I fear you have strayed off path again, so, as you inferred, I will provide a 
very brief background to this.

I am speaking from direct experience, as I have specialised in the matching 
engine space since 1992, when I co-wrote the first commercial in-memory matching 
engine (called X-Stream).
This is now one of NasdaqOMX’s three matching engines (iNet, OMX Click/Genium and 
X-Stream), and X-Stream now runs in over 40 global markets (we implemented the 
first 20, including the Shanghai Stock Exchange, Istanbul and Moscow, to name a few).

So there are no “rumours”, unvalidated benchmarks, generalisations or suspect 
implementations, and data sets were (and are) very large.  This is all 
firsthand experience, testing, measurement and optimisation.
We also capture the entire matching history in a tick database (we use kdb+), 
and I have worked in array languages since 1979, at IPSA, where I co-wrote 
Sharp APL/HP with Arthur Whitney (kdb+), so I have a fair idea of internal 
array-language architecture.

A large equities exchange can have 5,000 to 10,000 listed securities; futures and 
options can inflate that by an order of magnitude.
For each security, the order book consists of bids and offers and could be 
anywhere from 10 to 200 rows deep, where each row is a price tier holding 
10 - 1,000 orders in time-priority order.

Our current platform (MarketGrid) runs one crypto client whose leveraged-trading 
order book is over 1,000 rows deep (not dissimilar to BITMEX).
Visibility is irrelevant; I am not talking about the UI (which is a separate 
browser-based application and communicates with the engine via its own memory 
cache), I am talking about the server, the part that does centralised in-memory 
order matching for the entire market.

We run the matching engine on Linux servers with anywhere from 32 GB to 1+ TB of 
physical memory, barely touching the disk, with all our own housekeeping, 
written mostly in C (as it has been since the 1990s).
Transaction rates run from 100,000 orders per second of sustained matching up to 
300,000 - 400,000 orders per second on these engines, all fed via APIs 
(FIX, ITCH, OUCH, proprietary, etc.).
So we are talking single-digit microsecond round-trip latency from the client.  
Some markets run 24x7, others 9-5, 5-6 days a week; it all varies by asset 
class.

C kills C++ for this stuff (it is simply closer to the metal), and likewise both 
kill array languages.
I have seen efforts to write matching engines in array languages (e.g. BITMEX 
is written in kdb+), and they all fail on performance, barely able to consume 
1,000 orders per second.
The key reason is the array language’s overhead in processing incoming orders 
that each arrive in a single message, at upwards of 100,000 orders per second.

You can append to arrays in array languages at this rate, but you cannot 
maintain indices for optimal search matching (head of list), and it is likewise 
impractical to run shadows of the data.
Cache misses do not impact this matching: it is all in physical memory, and all 
orders are stored as contiguous rows in the Orders table, as are the other 
tables.  Linked lists provide the only proven way to scale to the matching 
speeds required.

So I stand by my comments, which were a constructive suggestion that an 
“ordering attribute” on an array could be a mechanism to help an array language 
become useful in this space, as I used sorted linked lists in C.
However, relying solely on arrays for this kind of performance, when receiving 
and processing over 100,000 “events” (orders) per second, will not do it.

I won’t comment further on this thread as there is little point, but suffice it 
to say it is always useful to know your audience.

Thanks, Rob

> On 21 Nov 2019, at 4:17 am, Raul Miller <rauldmil...@gmail.com> wrote:
> 
> On Tue, Nov 19, 2019 at 8:00 PM 'Rob Hodgkinson' via Programming
> <programm...@jsoftware.com> wrote:
>> The overhead associated with the multi key indexing you describe
>> below is impractical for low latency matching, and avoided by
>> managing a free space list for bucket re-use.
> 
> Why do you say that? (How big of a dataset does your UI represent? How
> fast is data arriving? And, if it's large, what are you showing to
> your users?)
> 
> With a linked list, you have array elements scattered throughout
> memory, which tends to result in high overhead from cache misses.
> 
> With an array, you have array elements densely packed, which gives
> high cache coherency.
> 
> And, for something like list/array traversal, cache misses are order
> of magnitude more expensive than computation.
> 
> (So it sounds like you might be working off of untested rumor, or
> maybe that your benchmarks did not involve good implementations or did
> not involve large data sets.)
> 
> As a general rule, sorting does not take perceivable time in J unless
> you're working with datasets which are large enough that they would
> take years for one person to enter manually. (Or unless we're talking
> about secondary effects -- if it takes 30 milliseconds to sort a large
> dataset and you're doing animation such that you only have 10
> milliseconds to prepare the next frame of imagery, that's a real
> problem, and getting things to work right there takes some effort and
> some complicating simplifications.)
> 
> Put different: working with these kinds of issues can be tricky, and
> can involve issues that many people are not prepared to deal with.
> Consequently, people tend to favor a working implementation over any
> other implementation (which is fine, but it's a bad idea to generalize
> about reasons when you cannot adequately observe what you're working
> with).  But we also get people repeating ideas (and working off of
> ideas) which don't actually play out in practice.
> 
> On the other side of things, though, there's another issue to think
> about when working with J, which is the single threaded character of
> the implementation. Ideally, for UI work and significant computation,
> you want the UI to work independent from the computational engine --
> the reason is that machines fail often enough and in so many different
> ways that people need to be reassured that the thing is still working.
> So, with J that might mean something like a javascript based UI and
> JHS.
> 
>> I still believe that both an ordering attribute and also a
>> dictionary, would be very useful assets to an array language, but
>> would need to be implemented “natively” within the language to be
>> fully useful (not an add-on, but part of the language notation, as
>> per kdb+ for dictionaries for example).
> 
> That could be, but (speaking personally): I am not going to optimize
> code that I don't have access to.
> 
> Thanks,
> 
> -- 
> Raul
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm

