Folks,

This thread has actually turned into several different conversations -- all
related to threads and queues and how the AR System server functions.  To
share a bit of information about how the system works, to clarify the purpose
and intention behind some of these things, and to offer some comment on the
original question, I thought I would try to address the various topics in
turn.

1) What is the difference between a queue and a thread?

Before going into the fast/list discussion, I want to make sure that everyone
is clear about a couple of terms:  queue and thread.  Fast and list are
actually queues in the system, each of which in turn has one or more threads
defined for it.

In the AR System server, there is a set of queues.  There are the Admin, Fast,
List, and Escalation queues by default (OK, there are also some used for
services like the plugin server and such).  These are essentially connection
points that you can go to.  You can also have private queues that offer
additional connection points to the system.

Each of these queues has one or more threads -- each thread being a database
connection and a processing lane for an API call.

Think of it the following way.  If you were going to a futbol game in Spain
(that's soccer for those of us in the backwards United States), the stadium
generally has multiple queues you can enter.  Say in a simple case, they may
be on the N, E, S, and W of the stadium.  Now, there may be a special queue
in the NE for the "skybox" owners.

Each of the queues, the entrances to the stadium, has multiple lanes where
people can enter.  These are the threads.  Threads are local to the individual
queues.  You cannot be in the queue on the N of the stadium and go through an
entry lane that is on the S for example.

However, if one of the queues is closed, people get routed to one of the
queues that is open.  You are not blocked out of the game just because the
queue you were targeting is not available; you are simply routed to another
one that is (in the AR System case, the fast/list pair of queues is where you
get routed if your specific queue is not available).


So, the system has a set of queues -- some pre-defined, some private and
defined per site -- and each of them has processing threads as configured.

No, this was not a topic brought up, but it is important to understand this
clearly for some of the topics that were.
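
(Purely as an illustration of the queue/thread relationship and the fallback
routing, here is a small Python sketch.  This is not AR System code, and the
thread counts shown are made-up examples, not product defaults.)

    # Illustrative model only, not AR System code.  Each queue owns its own
    # threads; every thread is one database connection plus one processing
    # lane, and a thread never serves a queue other than its own.
    queues = {
        "admin":      1,   # restructuring operations only
        "fast":       5,   # example thread count
        "list":       8,   # example thread count
        "escalation": 1,
        # private queues defined per site would be added here
    }

    def route(target):
        """A call aimed at a queue that is not available falls back to fast/list."""
        return target if target in queues else "fast"   # or "list", depending on the call

    print(route("list"))           # -> list
    print(route("private-xyz"))    # -> fast (no such private queue defined here)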


2) Why fast/list?  Are they relevant?

One of the topics that was discussed is the fast vs. list queue and the
reasoning behind it.

As was noted, any queue in the system can perform any operation.  OK,
almost...  The exception is that any operation that restructures definitions
or changes the database structure MUST go through the Admin queue and will be
routed to that queue.  No queue other than the Admin queue will process
restructure operations.

Anyway, other than that distinction, any queue in the system can perform any
non-restructuring API operation.

By default, BMC has split the work across two different queues:

   Fast (just a name without intending to indicate performance)
   List (just a name as well but was aimed at things that search/scan/find and
         return lists of things)

List calls may be faster than Fast calls.  Fast calls may be faster than List
calls.

The "fast" queue gets all calls that are controlled by the developer and that
have discrete operations and activity.  This includes operations that create,
modify, delete.  It includes retrieving details of a single item given the
ID of that item.  It includes a lot of miscellaneous calls that have definitive
discrete operations where the end user is making a call/performing an operation
where they don't really have control over what the operation is going to do
at the end of the day.

The "list" queue gets all the calls that often are (not always but often can
be) affected by the end user or where the speed of operation is not always
controllable.  It includes the search calls and operations like export and
running processes on the server from an active link.  These operation are often
fast, but they have the potential to become long.  There is high variability to
the performance or throughput of the calls.  Depending on how well qualified or
what you are trying to retrieve, they are calls that can return little or large
amounts of data.  The user often has an influence into overall throughput or
performance because they often have some level of control over the
qualifications or the amount of data they can request.  (Yes, that is the
reason the Admin has been given lots of different ways to control how much data
and the way that the query can be constructed -- to control the performance and
impact of these calls).
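
(To make the split concrete, here is a rough sketch of the pattern using the
call names discussed in this thread.  It is illustrative only; the server's
actual routing is decided per call and covers far more than this.)

    # Rough sketch of the fast/list split -- illustrative only.  Discrete,
    # developer-controlled operations go to the fast queue; calls whose cost
    # depends on the user's qualification and result size go to the list queue.
    FAST_CALLS = {"ARCreateEntry", "ARSetEntry", "ARDeleteEntry", "ARGetEntry"}
    LIST_CALLS = {"ARGetListEntry", "ARGetListEntryWithFields", "ARExport"}

    def target_queue(api_call):
        if api_call in LIST_CALLS:
            return "list"
        if api_call in FAST_CALLS:
            return "fast"
        return "fast"   # default for the sketch; the real server decides per call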

Having the two queues is still as relevant as it has always been.  It allows
the number of threads to be adjusted independently for the two different
classes of operations.  Very often, system administrators will find that
adjusting the number of threads in one of these queues has a significant
impact on performance.  If there were no difference between the queues, or in
the way the system's load is split between them by default, this wouldn't
really be the case.

Also, in general, a higher number of threads in the list queue than in the
fast queue is found to be an appropriate configuration of the system.  The
vast majority of the time, the variability of the interaction on the search
calls and the overall time spent on searching vs. creating/updating dictates
that more database connections and processing threads related to searching
will give the system better throughput.
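
(If you want a starting point for the split, one hypothetical heuristic -- my
illustration here, not a BMC formula -- is to divide a thread budget in
proportion to the busy time each queue accumulates in your own API logs.)

    # Hypothetical heuristic, not a product formula: split a thread budget
    # between the fast and list queues in proportion to the busy time each
    # queue shows in your API logs.
    def split_threads(total_threads, fast_busy_seconds, list_busy_seconds):
        total = fast_busy_seconds + list_busy_seconds
        fast = max(1, round(total_threads * fast_busy_seconds / total))
        return fast, max(1, total_threads - fast)

    # e.g. a site where searching accumulates twice the busy time of creates:
    print(split_threads(12, 400, 800))   # -> (4, 8)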


3) Dispatcher model for queue processing

Now, to the strategy for how queue processing works.  When a call comes into
the AR System server, it is targeted at some queue on the system.  If that
queue is not available, the call is redirected to a queue that is available.
If the target queue is available, the dispatcher for that queue accepts the
operation and places it on a single list for the queue.  All items that
arrive at a queue are processed by that queue in the order received.  Once an
item has arrived at a queue, that queue is responsible for processing it --
even if there is another queue somewhere else with fewer items to process.

Once things are in that internal list, the operation at the front of the list
is handed to the next available thread within that queue.  If there is a thread
immediately available, the request never sits in the list at all.  If all the
threads are already busy processing requests, the item will sit in the internal
list until one of the threads finishes and then the next item will start
processing.  There is no inherent limit of 5 items in the list and it is
important to note that the list of pending requests is at the queue level not
at the thread level.
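
(Here is a tiny simulation of that behavior -- a model only, not the server's
code: one FIFO list for the queue, items handed to the next free thread,
nothing rejected, and no fixed cap on how many requests may wait.)

    # Tiny model of a queue's dispatcher: a single FIFO list of requests,
    # each handed to the next free thread; nothing is rejected and there is
    # no built-in limit on how many requests may wait.
    import heapq

    def finish_times(arrivals, service_time, threads):
        free_at = [0.0] * threads          # when each thread next becomes free
        heapq.heapify(free_at)
        done = []
        for t in sorted(arrivals):
            start = max(t, heapq.heappop(free_at))   # wait only if all threads busy
            heapq.heappush(free_at, start + service_time)
            done.append(start + service_time)
        return done

    # 60 requests arriving at once, 10 threads, 10 seconds each (the numbers
    # from the original discussion quoted below): the last request completes
    # at 60 seconds.
    print(max(finish_times([0.0] * 60, 10.0, 10)))   # -> 60.0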


4) The original topic -- how to deal with high volume blasts of concurrency

Now, we are back to the original topic of blasts of concurrent operations.

First, is it really the case that the only pattern is no work, then a blast
of xxx simultaneous operations, then no work again?  Could the other system
send a more steady stream of work rather than periodic blasts?  Even a little
spreading would balance the load -- on both systems.  Now, this all depends
on the other system being able to spread things out in some way.  If it can,
that is a big step for any system in dealing with volume.  If not, then we
have to look at other things on the AR System side.
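
(If the other system can spread things out, even something as crude as the
following hypothetical client-side pacing turns a spike into a stream.  The
submit_one name is a made-up stand-in for whatever call that system actually
makes.)

    # Hypothetical client-side pacing: spread a burst of submits over a time
    # window instead of firing them all in the same second.  submit_one() is
    # a stand-in for whatever call the external system actually makes.
    import time

    def submit_one(request):
        pass   # placeholder for the real web service / API call

    def submit_paced(requests, window_seconds):
        gap = window_seconds / max(1, len(requests))
        for request in requests:
            submit_one(request)
            time.sleep(gap)   # even a small gap flattens the concurrency spike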

Assuming you cannot spread the load....

First, your note didn't indicate you were losing operations or that they were
being rejected.  That is because of the dispatcher model: all the operations
are being accepted and put on the list for processing; they just cannot all
be processed simultaneously.  So, to start with, we are not losing
operations, we just have a tuning issue to deal with.

With the queue model of the system, the first thing I would recommend is to
configure a private queue for the use of this system.  I assume your server
is not used only by this one automated processor, so defining a private queue
just for that automated system isolates its load from other users --
especially from interactive users.  You don't want to affect other users of
the system when the "flood" hits from this automated system.

Using a private queue will isolate the load -- just like in the stadium
example where you have a special entrance for the "unruly crowd".

Then, we can look at the number of threads that are appropriate for this
private queue.  That can be looked at independently of the number of threads
for other queues in the system for other purposes.
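
(A private queue is defined in the server configuration file with an RPC
program number and its own min/max thread counts.  As a sketch: I believe the
option is Private-RPC-Socket and that private queues use RPC program numbers
in documented ranges such as 390621-390634, but confirm both against the
documentation for your version before relying on them.)

    # Sketch only: build the ar.cfg/ar.conf line that defines a private queue.
    # The option name ("Private-RPC-Socket") and the valid RPC program number
    # ranges (e.g. 390621-390634) should be confirmed against your version's
    # documentation.
    def private_queue_line(rpc_program, min_threads, max_threads):
        return "Private-RPC-Socket: %d %d %d" % (rpc_program, min_threads, max_threads)

    print(private_queue_line(390621, 6, 12))
    # -> Private-RPC-Socket: 390621 6 12

The automated client would then be pointed at that RPC program number instead
of the default fast/list pair; how you do that depends on the client or API
that system is using.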

In theory, there is no reason you cannot have 10s or even 100s of threads in a
queue.  It is just a number to us; we will start up that many if needed.  What
you need to be aware of is that each thread opens a database connection, so
you need a database that allows that many connections, AND each thread takes
some amount of memory.  With the 7.5 release on UNIX and Linux supporting a
64-bit address space, the available memory can be grown further if needed
(the server is still 32-bit on Windows, but 64-bit is coming).  You have to
weigh how much memory you have, the cost of swapping, and the overall process
overhead on the system against the number of threads you want to configure.
You also have to watch system configuration of per-process memory and file
descriptors (open connections count as file descriptors) and other such
limits you may encounter when configuring very large numbers of threads.

BUT, there is no inherent restriction in the AR System about the number of
threads you could configure if you wanted to.
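
(Back-of-the-envelope arithmetic for that sizing decision.  The per-thread
memory figure below is a placeholder you would measure on your own system,
not a product number.)

    # Rough sizing arithmetic for "how many threads can I afford?"
    # mem_per_thread_mb is a placeholder -- measure it on your own system.
    def thread_budget(threads, mem_per_thread_mb=10, db_connection_limit=200):
        return {
            "db_connections_needed": threads,        # one DB connection per thread
            "db_connection_headroom": db_connection_limit - threads,
            "approx_thread_memory_mb": threads * mem_per_thread_mb,
        }

    print(thread_budget(100))
    # {'db_connections_needed': 100, 'db_connection_headroom': 100,
    #  'approx_thread_memory_mb': 1000}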

You have already seen some of the issues that highly simultaneous operations
on a single table can cause.

There are some settings that can help with that, for example:

Next ID blocks -- limits contention on the next ID database column
No status history -- if you don't need it for your table, you can set this
   option and eliminate the creation/management of the status history table
   and entries (definitely available in 7.5; maybe in 7.1 since I cannot
   remember the release this option was added in)

If your workflow pushes to other forms and performs create operations on
them, you should really look at these settings on those forms as well,
because you are indirectly issuing creates against them too.
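
(To see why the Next ID block setting helps under a flood of creates, here is
a toy illustration -- not server code -- of how handing out IDs in blocks
cuts the number of contended updates to the next-ID row.)

    # Toy illustration, not server code: with a block size of N, a thread only
    # has to update the shared next-ID counter once per N creates instead of
    # once per create, which is where the contention relief comes from.
    def next_id_fetches(creates, block_size):
        return -(-creates // block_size)   # ceiling division

    print(next_id_fetches(1000, 1))    # -> 1000 contended updates, no blocking
    print(next_id_fetches(1000, 10))   # -> 100 with a block size of 10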

There are also options to control where your DB indexes are stored vs. the
data, so that indexes and data can be placed on different disk areas -- which
helps throughput under heavy create/modify load.

You should investigate the operations your workflow is performing to make
sure they are tuned as well as possible and that there are no inefficient
searches or steps in the process, so that the total processing time for each
create is minimized.  If you don't have to do something during the
interactive processing of the create, don't.  Drop it entirely if it is not
needed, or defer the processing if it is not necessary for the interactive
response to the user.

Hopefully, this gives some ideas to look at.  It is not a definitive answer
to your inquiry, but some thoughts that may lead you toward the best answer
for your situation -- and maybe some ideas for others to consider as well.



I hope this note has been useful in providing some information about how the
system functions -- and some of the reasoning behind why choices were made --
as well as some ideas for dealing with situations involving high volumes of
simultaneous operations.

Doug Mueller

-----Original Message-----
From: Action Request System discussion list(ARSList) 
[mailto:arsl...@arslist.org] On Behalf Of Misi Mladoniczky
Sent: Monday, January 18, 2010 9:18 AM
To: arslist@ARSLIST.ORG
Subject: Re: Fast/List Concurrent settings?

Hi,

I do not think that we disagree here.

I checked some logs, and apparently my statement that "list" in the name
guarantees routing to a List-thread does not hold.  But ARGetListEntry() and
ARGetListEntryWithFields() are definitely calls that are routed to the
List-threads, and ARCreateEntry() and ARSetEntry() go to the Fast-threads.

My opinion is that the fast/list-threads have lost a little of their function.

This is definitely true if you look at the ARCreateEntry() and ARSetEntry()
calls, as they quite commonly perform a lot of work on the server side and
are not fast at all.

Why would it be an advantage to have queuing going on in either the Fast- or
List-threads while the other type (Fast or List) has no work to do?

It would be better to throw everything into the same pool, and remove the
settings for list/fast threads.

The admin queue, escalation queues and private queues still serve a valid
purpose.

I checked some server logs from ITSM7, and here are the numbers:
Thread/Calls/Time
Fast/35881/~26 minutes
List/32543/~11 minutes
The "Fast" threads have a slightly higher number of calls, but use more than
double the time per call.  The "Fast" calls ran at 0.044 seconds per call,
and the "List" calls ran at 0.019 seconds per call.

        Best Regards - Misi, RRR AB, http://www.rrr.se

Products from RRR Scandinavia:
* RRR|License - Not enough Remedy licenses? Save money by optimizing.
* RRR|Log - Performance issues or elusive bugs? Analyze your Remedy logs.
Find these products, and many free tools and utilities, at http://rrr.se.

> Misi, I know that you know AR System internals very well, and so normally,
> I
> would never question you.  So consider this response with that in mind, as
> well as the disclaimer that this is the last thing I was taught on the
> topic
> - if the architecture of the internals has changed, I must have missed
> that
> memo.
>
> The purpose behind Fast and List threads was to separate single-API
> functions (Submits, mostly) from multiple-API functions (pretty much
> everything else).  The thought, from what I can infer, was to have an
> express line of sorts so that quick DB transactions wouldn't have to wait
> in
> line behind long search/retrieve tasks.  If, as you infer, Fast/List
> threads
> no longer have any relevance in terms of different function, why do they
> still exist as separately tunable parameters?
>
> Rick
>
> On Mon, Jan 18, 2010 at 5:06 AM, Misi Mladoniczky <m...@rrr.se> wrote:
>
>> Hi,
>>
>> I do not think that updates from Web Services use the list threads. It
>> would be the same thing as the Save-button in a Modify-form.
>>
>> It performs an ARSetEntry()-call through a Fast-thread. It is performed
>> by
>> the Mid-Tier-server just as any other client.
>>
>> The Fast- and List-threads are just a blunt way to group and route
>> various
>> calls.
>>
>> If we exclude the admin-tool/developer-studio calls, I think that any
>> call with "list" in its name accesses the List-servers and any call
>> without "list" in its name goes to a Fast-server.
>>
>> There is no technical difference between Fast and List threads. They
>> perform in the same way.
>>
>> As a Fast-call can be very slow, and List-calls very quick, I would
>> argue that differentiating between these types of calls is not that
>> useful.
>>
>> The calls to the fast and list threads typically come in pairs anyway:
>>
>> An ACTL-Set-Fields performs two calls: one ARGetListEntry() to the
>> List-thread, and one ARGetEntry() to the Fast-thread. An ACTL-Push-Fields
>> performs an ARGetListEntry() and an ARCreateEntry()/ARSetEntry() call.
>>
>> In the old days, I think that the List-threads and Fast-threads had
>> another distinction. When a user logged in, he was "assigned" to a
>> Fast-thread but accessed the List-thread in a more random way. The idea
>> was to minimize the calls to the transaction-daemon and portmapper to
>> find
>> the tcp-port of the fast-threads each time they were called. Now
>> everything goes to a single tcp-port.
>>
>> When we got the FLTR-Push-Fields calls, the ARCreateEntry() and
>> ARSetEntry() calls stopped being very fast. WebServices adds even more
>> to
>> this problem.
>>
>> In most cases, it is possible to rewrite workflow to be much faster. In
>> some cases, especially with the bigger apps such as ITSM, it is more
>> difficult. And you do not want to customize ITSM anyway...
>>
>>        Best Regards - Misi, RRR AB, http://www.rrr.se
>>
>> Products from RRR Scandinavia:
>> * RRR|License - Not enough Remedy licenses? Save money by optimizing.
>> * RRR|Log - Performance issues or elusive bugs? Analyze your Remedy
>> logs.
>> Find these products, and many free tools and utilities, at
>> http://rrr.se.
>>
>> > As an aside, technically your submits are going through the Fast
>> > (single-API) threads, while your updates from Web Services would use
>> the
>> > List (multiple-API) threads.  That's if your submit is allowed to
>> finish
>> > and
>> > then the ensuing record is updated by the WS.  If the WS return values
>> are
>> > part of the Submit process, then you have long transactions processing
>> > through and clogging up your Fast threads, which is not what they are
>> > intended to handle.  That will be the case regardless of the source of
>> the
>> > external data retrieval process or source, though the time they take
>> may
>> > be
>> > affected by using DB vs. WS.
>> >
>> > Rick
>> >
>> > On Fri, Jan 15, 2010 at 3:35 PM, Jason Miller
>> > <jason.mil...@gmail.com>wrote:
>> >
>> >> ** Just to make sure that I understand, the requesting system is
>> waiting
>> >> 40
>> >> seconds for the quote to be returned?  Does the quote return need to
>> >> done as
>> >> soon as possible or can there be a minute or two before the return?
>> >>
>> >> Maybe have them only submit the record and not trigger the processing
>> >> workflow.  Then have an escalation that will process the new
>> record(s)
>> >> and
>> >> then make a call to a web service on requester's end that is designed
>> to
>> >> accept the return from a previous transaction.  It would take some
>> >> redesigning on both sides but would alleviate the waiting and hanging
>> on
>> >> to
>> >> threads/connections.
>> >>
>> >> The other thing might be to offload some of the Remedy processing (if
>> >> there
>> >> is a lot) to more efficient scripts and/or direct database actions.
>> Web
>> >> services can be a wonderful thing but not always the fastest.
>> >>
>> >> Jason
>> >>
>> >>
>> >> On Fri, Jan 15, 2010 at 3:02 PM, LJ Longwing
>> >> <lj.longw...@gmail.com>wrote:
>> >>
>> >>> **
>> >>> LOL.....consider this Rick....Consider a 'submit' to be an interface
>> >>> that
>> >>> allows ALL attributes needed to completely configure a change with a
>> >>> single
>> >>> button press.  This may be a bad example...if it is I'm sorry...I
>> >>> really
>> >>> don't do ITSM....but the submit in question on our system actually
>> >>> averages
>> >>> about 40 seconds.  Makes no less than 4 calls to other external
>> systems
>> >>> via
>> >>> web services and performs more calculations than I care to think
>> about.
>> >>> That being what it is....what would you do in that situation?
>> >>>
>> >>>  ------------------------------
>> >>> *From:* Action Request System discussion list(ARSList) [mailto:
>> >>> arsl...@arslist.org] *On Behalf Of *Rick Cook
>> >>> *Sent:* Friday, January 15, 2010 3:55 PM
>> >>>
>> >>> *To:* arslist@ARSLIST.ORG
>> >>> *Subject:* Re: Fast/List Concurrent settings?
>> >>>
>> >>> ** If your submits are taking 10 seconds each, you have a
>> significant
>> >>> problem, my friend!
>> >>>
>> >>> Rick
>> >>>
>> >>> On Fri, Jan 15, 2010 at 2:52 PM, LJ Longwing
>> >>> <lj.longw...@gmail.com>wrote:
>> >>>
>> >>>> **
>> >>>> I completely agree with everything you said....but as you
>> mentioned,
>> >>>> if
>> >>>> you have 60 submits and have 10 threads...you are only handling 10
>> of
>> >>>> those
>> >>>> 60 concurrently....so if each transaction takes an average of
>> say...10
>> >>>> seconds...the first 10 will complete in 10, the next 10 in 20, and
>> so
>> >>>> on
>> >>>> with the final set of 10 taking a full 60 seconds from time to
>> >>>> complete to
>> >>>> process.  If your external application has a timeout of say....45
>> >>>> seconds
>> >>>> then you are only going to be able to handle 40 of the 60 submitted
>> >>>> concurrently and as such would have a 1/3 failure rate....in that
>> >>>> situation
>> >>>> would you then set your threads to say....15 to make it so that you
>> >>>> could
>> >>>> handle 60 in 40 seconds or would you take it to 60 to be able to
>> >>>> handle 60
>> >>>> in 10 seconds?
>> >>>>
>> >>>>  ------------------------------
>> >>>> *From:* Action Request System discussion list(ARSList) [mailto:
>> >>>> arsl...@arslist.org] *On Behalf Of *Rick Cook
>> >>>> *Sent:* Friday, January 15, 2010 3:39 PM
>> >>>> *To:* arslist@ARSLIST.ORG
>> >>>> *Subject:* Re: Fast/List Concurrent settings?
>> >>>>
>> >>>> ** I would think that one of the larger software-related issues
>> that
>> >>>> would affect the number of concurrent Inserts would be the Index
>> >>>> structure.
>> >>>> I am sure you know this, but for the benefit of those who don't,
>> >>>> indexes are
>> >>>> meant to help us search on a form more easily.  However, above a
>> >>>> certain
>> >>>> number (around 8 for a Remedy form), they cause performance
>> >>>> degradation when
>> >>>> creating a new record, because each of the indexes has to be
>> updated
>> >>>> as part
>> >>>> of the creation process.  If a system is seeing slow creates,
>> that's
>> >>>> the
>> >>>> first thing I check.
>> >>>>
>> >>>>
>> >>>> One other thing to think about is that since each thread can cache
>> 5
>> >>>> processes (that's the last number I heard, anyway) in addition to
>> the
>> >>>> one
>> >>>> currently being handled, you could, with 10 Fast (Single-API)
>> threads,
>> >>>> handle 60 concurrent create processes without a transaction loss.
>> >>>> There
>> >>>> would probably be a bit of a delay for those farther back in the
>> >>>> queue, but
>> >>>> if you had a reasonably robust system, most users wouldn't notice
>> it
>> >>>> much.
>> >>>> I seriously doubt all but a very few systems have to handle
>> anything
>> >>>> resembling that kind of concurrent load, with the exception of
>> those
>> >>>> who
>> >>>> have a large number of system (i.e. NMS) generated records.
>> >>>>
>> >>>> Also look at the Entry ID Block size when doing this test.  If -
>> AND
>> >>>> ONLY
>> >>>> IF - you are regularly having large numbers of concurrent inserts,
>> you
>> >>>> can
>> >>>> set the Entry ID Block size to something like 10 to cut down the
>> >>>> number of
>> >>>> requests to the DB for Entry IDs.  That is alleged to help with
>> create
>> >>>> times, though I have not seen that be the case in practical use.
>> >>>>
>> >>>> Rick
>> >>>>
>> >>>> On Fri, Jan 15, 2010 at 2:27 PM, LJ Longwing
>> >>>> <lj.longw...@gmail.com>wrote:
>> >>>>
>> >>>>> **
>> >>>>> Here is an interesting question for you thread experts out there.
>> >>>>>
>> >>>>> How many concurrent 'creates' does your system support?  Creates
>> >>>>> being a
>> >>>>> generic term for any given process that your system supports.  The
>> >>>>> system
>> >>>>> that I support is a home grown quote/order management system and
>> last
>> >>>>> year
>> >>>>> we stood up a web service interface for people to be able to
>> generate
>> >>>>> quotes
>> >>>>> from their systems and get pricing back.  The initial interface
>> was
>> >>>>> setup to
>> >>>>> handle 3 concurrent creates...but as soon as it was a success we
>> >>>>> started
>> >>>>> getting slammed with several hundred at a time and choking our
>> >>>>> system.
>> >>>>> Through rigorous testing and tweaking over the last couple of
>> weeks I
>> >>>>> have
>> >>>>> been able to get roughly 60 concurrent going through the system
>> with
>> >>>>> reasonable performance....so at this point I have my Fast set to
>> >>>>> 30/100
>> >>>>> min/max....confirmed that I'm not maxing anything specific
>> out....but
>> >>>>> I
>> >>>>> personally have never run above 20ish threads as a high because
>> most
>> >>>>> transactions are short and a fast thread count of 20 will handle
>> >>>>> hundreds of
>> >>>>> users in 'normal' operation.....so I was just wondering how many
>> >>>>> requests
>> >>>>> you guys have your system to handle concurrently....and just for
>> >>>>> verification...I'm talking about 'all of them hit the button
>> within a
>> >>>>> second
>> >>>>> of each other' type of concurrent...not 'I have 400 people logged
>> on
>> >>>>> concurrently'