On Wed, Jan 6, 2010 at 11:55 PM, Lars Marowsky-Bree <[email protected]> wrote:
> On 2010-01-06T16:32:33, Dejan Muhamedagic <[email protected]> wrote:
>
>> > That said, I'm fine with using a file if it keeps up the performance.
>> > But the sync_script - we definitely don't want to be calling an external
>> > script, I guess. csync2 would be automatic anyway.
>>
>> The script (or program) is invoked once in a while on monitor,
>> shouldn't be a performance issue.
>
> It's invoked every time, and to keep the connection table reasonably
> up to date, I guess monitor would be scheduled fairly frequently.
>
> So yes, this is costly, the question is whether it is still good enough
> ;-) In particular over drbd, it'd be good to see a test and establish
> what a reasonable limit is.
>
>> > Oh. This is quite costly, is it not? Scanning the full TCP connection
>> > table and regenerating the whole lease-file?
>> Looking again at it, egrep and while loop could be replaced by a
>> perl/awk/python snippet. Then it should be fast enough.
>
> We're looking at scanning potentially thousands of connections here,
> which a busy file server might easily have. There's not just the
> processing overhead in user-space, but also the overhead of retrieving
> the information from the kernel, and I doubt that that is extremely fast.
> Worse, scanning those tables might even incur an in-kernel penalty due
> to data structure locking.
>
> But that's just a gut feeling, I'd be happy to see some numbers.
>
>> > Is there really no interface by which we can be notified when TCP
>> > connections get established and deleted?
>> In that case we need a daemon to take care of the connection
>> table. That would obviously be a more elegant solution, but I'm
>> really not sure if it's absolutely needed. This machinery seems
>> good enough to me. At least let's first see how it behaves in
>> some busy environment.
>
> Yes, and that's why I'd like to see some numbers ;-)
>
>> > How is this done in Samba?
>
> Further, has there been some communication with say the netfilter group
> or the internal Labs kernel team as to how to get at that state?
>
> I'm not saying the current approach is bad, I just want to understand
> why it is good enough ;-)
Besides the approach I have implemented, several other approaches have
been suggested by many people. Sorry that I haven't listed all your
names here, but I really appreciate all the suggestions and all of you :)
Now I would like to collect all your ideas and suggestions (including
those from the discussion on this list about a year ago):
For the tickle ACK feature, there are 3 parts we should handle:
1) how to let the client reset quickly:
This is the core of the tickle ACK idea. A plain RST from the server
to the client may not work: the server doesn't know the sequence
number, so the client may ignore the packet. Hence the tickle ACK, a
pseudo ACK which triggers the client to send back a genuine ACK; with
that, the server can successfully RST the connection. Please note that
this works when the client has sent out some packets and is waiting
for the ACK from the server, but the original server is down. Cutter
and CTDB use the same idea.
2) how to collect the established TCP connections:
There are 3 ways for now:
A) netstat -tn: a simple command; Samba uses it.
B) conntrackd: more precise; it also provides a function to sync
the TCP connections across the cluster.
C) the in-kernel tcp_diag.c plus a user-space program that talks to
it: this still needs to be investigated and implemented.
3) how to sync the TCP connection info in the cluster:
A) using a file: very simple if you have file-level shared
storage; if not, you could also use csync2.
B) openAIS CKPT: may need a separate clone RA which calls the
CKPT service to get the TCP connection info.
C) conntrackd: also has this function, but I haven't investigated
it further (I assume we need a pre-configured config file to get
the membership list?)
For part 1, I think we have no other choice; also, the
implementation works in my testing.
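
To make the packet side of part 1 concrete, here is a minimal sketch in
Python of how the two TCP segments could be built. It is not the code
from my implementation, just an illustration. The names
build_tcp_segment and checksum, the window value and the all-zero
seq/ack numbers in the tickle ACK are assumptions made for this sketch.
It only constructs the TCP headers; actually sending them needs a raw
socket and root privileges, which is left out.

```python
import socket
import struct

TCP_RST = 0x04
TCP_ACK = 0x10


def checksum(data: bytes) -> int:
    """Internet checksum: one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_tcp_segment(src_ip, src_port, dst_ip, dst_port, seq, ack, flags):
    """Build a 20-byte TCP header (no options, no payload)."""
    offset_flags = (5 << 12) | flags   # data offset = 5 words, then flags
    window = 1234                      # arbitrary value for this sketch
    header = struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                         offset_flags, window, 0, 0)
    # The TCP checksum also covers a pseudo-header with both IP addresses.
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(header)))
    csum = checksum(pseudo + header)
    # The checksum field sits at bytes 16..17 of the header.
    return header[:16] + struct.pack("!H", csum) + header[18:]


# Step 1: tickle ACK from the service address with bogus seq/ack numbers.
tickle = build_tcp_segment("192.0.2.1", 2049, "192.0.2.2", 40000,
                           seq=0, ack=0, flags=TCP_ACK)

# Step 2: the client answers with a genuine ACK. Its acknowledgment
# number (made up here) is the sequence number it expects from us,
# so a RST carrying that sequence number will be accepted.
client_ack_number = 123456789
rst = build_tcp_segment("192.0.2.1", 2049, "192.0.2.2", 40000,
                        seq=client_ack_number, ack=0, flags=TCP_RST)
```

The addresses and ports above are documentation examples only; in the
real case they come from the saved connection list.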
For part 2 and part 3, we have 3 choices each (listed in no
particular order). Each choice has its pros and cons. I hope I haven't
missed anything (if I have, please point it out :)
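
As one data point for the simplest combination (2A plus 3A), here is a
rough Python sketch of what replacing the egrep and while loop with a
snippet could look like. It is an illustration, not the RA code: the
function names, the "local remote" line format of the file and the
atomic-rename step are assumptions of this sketch. It parses
netstat -tn output, keeps the ESTABLISHED connections on the service
IP, and rewrites the file in one go.

```python
import os
import subprocess
import tempfile


def parse_netstat(output: str, service_ip: str):
    """Return sorted (local, remote) pairs of ESTABLISHED connections."""
    conns = set()
    for line in output.splitlines():
        fields = line.split()
        # Data rows: Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        if state != "ESTABLISHED":
            continue
        # Compare the address part only (everything before the last ':').
        if local.rsplit(":", 1)[0] == service_ip:
            conns.add((local, remote))
    return sorted(conns)


def write_lease_file(path: str, conns) -> None:
    """Write to a temp file in the same directory, then rename over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        for local, remote in conns:
            f.write(f"{local} {remote}\n")
    os.replace(tmp, path)  # readers never see a half-written file


def collect(service_ip: str, path: str) -> None:
    """One monitor pass: run netstat -tn and regenerate the file."""
    out = subprocess.run(["netstat", "-tn"], capture_output=True,
                         text=True, check=True).stdout
    write_lease_file(path, parse_netstat(out, service_ip))
```

This still scans the whole connection table on every monitor, so it
does not answer the performance question by itself; it only removes
the per-line shell overhead, and numbers from a busy server are still
needed.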
Thanks,
Jiaju
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/