Hi guys,

We can do many different things when processing a search, and any of
them can collide with the others:

o simple search
  o with time limit
  o with size limit
  o with both
o abandon request received
o session closed
o search with ManageDsaIT control
o paged search
  o cancelled paged search
  o new paged search
o persistent search
o replication

Of course, we can have many searches running at the same time for one
session...

The problem is that the way it currently works, besides being inherently
complex, can make the server end with an OOM: we first compute the set
of candidates (a set of UUIDs), then fetch the entries one by one,
writing them to the client. Of course, if the client is not reading them
fast enough, those entries are stored in an in-memory queue, leading to
a potential OOM very quickly.

This is a real problem with the way we use MINA: we don't wait for the
entries to be actually *written* to the socket, we just push them into
the session.
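
To make the failure mode concrete, here is a minimal sketch of the
current pattern (fetchEntry and candidates are hypothetical
placeholders, not the actual ADS code):

    // Current approach: push every entry into the session and move on.
    // IoSession.write() is asynchronous in MINA 2: if the client reads
    // slowly, the pending entries pile up in the session's write queue.
    for ( UUID uuid : candidates )
    {
        Entry entry = fetchEntry( uuid );   // hypothetical entry lookup
        session.write( entry );             // returns immediately, entry is queued
    }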

In fact, there are three ways to write entries with MINA 2:
1) do what we currently do, and risk an OOM
2) for each write, get back a WriteFuture and wait on it until the
message has reached the socket. The problem with this approach is that
we block the thread until the entry is fully sent. It works because we
use an executorFilter, but this executor filter does not have an
infinite number of threads in its pool: at some point, we will block
completely (see the first sketch after this list)
3) a smarter, but much more complex, way: we write only the first entry.
When that message has been physically sent, we get a messageSent event,
and we can then process the next entry, and so on (see the second sketch
after this list). Of course, we will have as many MINA <-> ADS round
trips as we have entries to send, an overhead we have to take into
account (it's around 5 to 10%). However, doing so guarantees that we
never push anything into the queue: the entries are flushed one by one,
and we don't block any thread.
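
Roughly, solution 2 looks like this with the MINA 2 API (fetchEntry and
candidates are again hypothetical placeholders):

    // Solution 2: block on the WriteFuture until the entry hits the socket.
    // Simple, but it parks one executorFilter thread per slow client.
    for ( UUID uuid : candidates )
    {
        WriteFuture future = session.write( fetchEntry( uuid ) );
        future.awaitUninterruptibly();    // thread blocked until flushed
    }

And here is a minimal sketch of solution 3, driven by the messageSent
event (the cursor handling is a simplified stand-in for the real ADS
search machinery, and the Entry import is elided):

    import java.util.Iterator;

    import org.apache.mina.core.service.IoHandlerAdapter;
    import org.apache.mina.core.session.IoSession;

    public class EntryStreamingHandler extends IoHandlerAdapter
    {
        private static final String CURSOR = "searchCursor";

        /** Stores the cursor in the session and writes the first entry */
        public void startSearch( IoSession session, Iterator<Entry> cursor )
        {
            session.setAttribute( CURSOR, cursor );

            if ( cursor.hasNext() )
            {
                session.write( cursor.next() );   // prime the pump
            }
        }

        /** Called by MINA once the previous entry has physically been sent */
        @Override
        public void messageSent( IoSession session, Object message ) throws Exception
        {
            Iterator<Entry> cursor = ( Iterator<Entry> ) session.getAttribute( CURSOR );

            if ( ( cursor != null ) && cursor.hasNext() )
            {
                session.write( cursor.next() );   // only now do we queue the next one
            }
            else
            {
                session.removeAttribute( CURSOR );
                // this is where the SearchResultDone would be sent
            }
        }
    }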

At the moment, I have started to implement the third solution, and it
works pretty well for the simple search. It gets much more complex for
the pagedSearch, though, as it's a multi-SearchRequest operation, and we
have to deal with many different possibilities.
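
For reference, a paged search (RFC 2696) is a sequence of SearchRequests
sharing a cookie, so the server has to keep cursor state alive between
requests. A very rough sketch of the cases we have to handle (all names
here are hypothetical, not the actual ADS code):

    // Per-session paged search state, keyed by the cookie we hand back
    // to the client with each page.
    private Map<String, Iterator<Entry>> pagedCursors = new HashMap<String, Iterator<Entry>>();

    private void handlePagedSearch( IoSession session, byte[] cookie, int pagedSize )
    {
        if ( cookie.length == 0 )
        {
            // new paged search: create a cursor, register it under a fresh cookie
        }
        else if ( pagedSize == 0 )
        {
            // the client abandons the paged search: close and remove the cursor
        }
        else
        {
            // continuation: look up the cursor and stream the next page,
            // each page itself being flushed entry by entry as in solution 3
        }
    }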

The persistentSearch (and replication, which is based on a persistent
search) is also a different case: we can't know when an entry will be
modified, so we can only push the entries one by one, expecting that we
won't have thousands of them. I don't see any other solution for the
persistentSearch; we can only improve the initial update using the
mechanism described in #3.
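
In other words, for a persistent search the change notifications are
written as they happen, something like this (a hypothetical listener,
not the actual ADS interceptor API):

    // Persistent search: modifications arrive at an unpredictable rate,
    // so each change is pushed immediately, on the assumption that the
    // rate stays low enough for the write queue to remain small.
    public void entryChanged( IoSession session, Entry entry )
    {
        session.write( entry );
    }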

So, I'm working on all of that. It's partly implemented, but it will
take a bit of time to complete the work.

Any thoughts are welcome.

-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com 
