Re: Fwd: 4GB Memory question
>> 3- In RHEL5 there's no need for a specific hugemem kernel anymore as the kernel is smart enough to decide during boot what kind of technology should it use.
>
> That does not make sense to me. The kernel can find out whether it needs more than 1GB for the kernel space during boot according to the amount memory available, and it can decide whether it needs PAE. I don't see how it can decide whether any user space program needs more than 3GB of space, however, during boot.
>
>> - Noam
>
> Shachar

This is from the release notes of RHEL5 (Kernel Notes). I tend to believe that it applies on hugemem as well.

o X86 SMP alternatives
  o optimizes a single kernel image at runtime according to the available platform
  o ref: http://lwn.net/Articles/164121/

- Noam
Re: Copying and pasting Hebrew text Firefox-OOo
On 13/05/07, ik [EMAIL PROTECTED] wrote:

> Hi,
>
> The website as I see it is CP1255 and not iso-8859-8! I tested what you have tried to do, but I'm able to paste the Hebrew the same as it looks on Firefox 2.0.0.3. My OpenOffice is 2.2.0 under Kubuntu 7.04.
>
> Ido

Turns out that the site itself is part cp-1255 and part iso-8859-8, which all copy and paste fine. The emails they send are Visual Hebrew, not Logical, so the text is BACKWARDS. There is some mechanism ensuring that the text is flowed from left to right, so that it appears correct. It's all tables too, because linebreaks would screw it up.

So OOo is pasting correctly, it's just that 'correctly' is backwards. I'll write to them and tell them how stupid they are.

Also, the registration of 999.co.il does not work for anything other than Internet Explorer. There is some JavaScript in the signup form that does not work with Konqueror/Firefox/Opera. I suggest that some people call them (I did) to let them know. The more we complain, the more chance that they will fix it.

Dotan Cohen
http://what-is-what.com/what_is/html.html
http://technology-sleuth.com/short_answer/what_is_a_cellphone.html

= To unsubscribe, send mail to [EMAIL PROTECTED] with the word unsubscribe in the message body, e.g., run the command echo unsubscribe | mail [EMAIL PROTECTED]
Re: Fwd: 4GB Memory question
On Mon, May 14, 2007 at 09:51:15AM +0300, Noam Meltzer wrote:

> This is from the release notes of RHEL5 (Kernel Notes). I tend to believe that it applies on hugemem as well.
>
> o X86 SMP alternatives
>   o optimizes a single kernel image at runtime according to the available platform
>   o ref: http://lwn.net/Articles/164121/

Hmm, this has nothing to do with PAE though... rather it's dynamic run-time code patching to switch from uniprocessor to SMP mode.

Cheers,
muli
Re: Firefox accelerator keys workaround addon?
On Fri, 11 May 2007 09:07:47 +1000, Amos Shapira [EMAIL PROTECTED] writes:

> I've just heard about the following addon which fixes the problem of accelerator keys for russian keyboards: https://addons.mozilla.org/en-US/firefox/addon/3529
>
> Does any of the mozilla programmers here reckon they can tweak it to work for Hebrew?

The latest version of it (1.4) partially works for Hebrew too. Most of the keys (except for Q and W) work. Especially the most needed C-f and C-t.
Re: Fwd: 4GB Memory question
Noam Meltzer wrote:

> This is from the release notes of RHEL5 (Kernel Notes). I tend to believe that it applies on hugemem as well.
>
> o X86 SMP alternatives
>   o optimizes a single kernel image at runtime according to the available platform
>   o ref: http://lwn.net/Articles/164121/
>
> - Noam

While I cannot rule out that they did the same for hugemem, it still leaves in the question of boot time detection. I'll pose a wild guess as to what is done, and you tell me how likely it is:

At boot time, RHEL 5 tests whether the machine has more than 4GB of ram. If it does, it turns on both PAE and 4/4 split. Otherwise, it's no PAE and 3/1 split.

What do you think?

Shachar
need some help with tcp/ip programming
Hi, as a project for a company I'm writing a TCP/IP application on Linux in C. My application has 2 connections as a client to remote servers and is itself a server accepting remote client connections. I'm using the select() mechanism to manage all those connections.

Everything works nicely until one of the remote sides disconnects. In some such cases, I do not succeed in reconnecting to this remote as a client, or in re-accepting a connection from the remote client.

Reading some documentation on TCP/IP programming, I had the impression that the select mechanism should detect such a remote disconnect event, thus enabling me to make a further read from this socket, which should end in reading 0 bytes. Reading 0 bytes should indicate disconnection and let me disconnect properly from my side and try to reconnect. However, it seems that select does not detect all those disconnect events and, even worse, I cannot see any rule behind when it does detect this and when it does not.

So, I have a couple of questions and I'd most appreciate any assistance.

1. Would you confirm that select, indeed, does not detect each and every remote disconnection, and do you know if there is a rule behind those cases?

2. If I need to detect those disconnections to react properly in my application and I cannot rely on select, what would you suggest I do? Should I use a kind of ping mechanism to check every few seconds if the connection is still alive? Or maybe use multiple threads instead of select, where each thread is responsible for one connection source and, instead of select, I loop on read from this source and so can detect when I read 0 bytes, which is the disconnect indication, and react accordingly?

Does this make sense, or do you see issues in such an implementation also?

Thanks,
Rafi.
Re: Firefox accelerator keys workaround addon?
That's great! IMHO it is the biggest Firefox annoyance. This addon fixes the problem indeed. Thanks for the info!

Hadar.

On 5/14/07, Yair Friedman [EMAIL PROTECTED] wrote:

> On Fri, 11 May 2007 09:07:47 +1000, Amos Shapira [EMAIL PROTECTED] writes:
>
>> I've just heard about the following addon which fixes the problem of accelerator keys for russian keyboards: https://addons.mozilla.org/en-US/firefox/addon/3529
>>
>> Does any of the mozilla programmers here reckon they can tweak it to work for Hebrew?
>
> The latest version of it (1.4) partially works for Hebrew too. Most of the keys (except for Q and W) work. Especially the most needed C-f and C-t.
Re: need some help with tcp/ip programming
Rafi Cohen wrote:

> Hi, as a project for a company I'm writing a TCP/IP application on Linux in C.

ah welcome, welcome to the pleasure dome...

> My application has 2 connections as a client to remote servers and is itself a server accepting remote client connections. I'm using the select() mechanism to manage all those connections. Everything works nicely until one of the remote sides disconnects. In some such cases, I do not succeed in reconnecting to this remote as a client, or in re-accepting a connection from the remote client.

you're not describing things properly here, since later you say you are not managing to disconnect (the problem is not with re-connecting - which, if happened, would imply a different problem altogether)

> Reading some documentation on TCP/IP programming, I had the impression that the select mechanism should detect such a remote disconnect event, thus enabling me to make a further read from this socket, which should end in reading 0 bytes. Reading 0 bytes should indicate disconnection and let me disconnect properly from my side and try to reconnect. However, it seems that select does not detect all those disconnect events and, even worse, I cannot see any rule behind when it does detect this and when it does not.

select does not notice disconnections. it only notices if the socket was closed by the remote side. that's a completely different issue, and that's also the only time when you get a 0 return value from the read() system call.

> So, I have a couple of questions and I'd most appreciate any assistance.
>
> 1. Would you confirm that select, indeed, does not detect each and every remote disconnection, and do you know if there is a rule behind those cases?

TCP/IP stacks on linux, by default, will only notice a network disconnection (i.e. the network went down) reliably after 2 hours. that's how TCP/IP's internal keep alive mechanism is set. 2 hours is a completely impractical value for any sane system you might develop.

you can tweak this parameter of the TCP/IP stack for specific sockets, on current linux kernels, using a socket option. (man 7 tcp - and look for 'keepalive' - there are 3 parameters for this). i never used this mechanism, since it was only possible to make this change globally when i needed that.

> 2. If I need to detect those disconnections to react properly in my application and I cannot rely on select, what would you suggest I do? Should I use a kind of ping mechanism to check every few seconds if the connection is still alive? Or maybe use multiple threads instead of select, where each thread is responsible for one connection source and, instead of select, I loop on read from this source and so can detect when I read 0 bytes, which is the disconnect indication, and react accordingly?

read would fail just like select does, and because of the same reason. you could implement the keepalive in your application, in case the keepalive parameters tweaking of the TCP stack does not work, for some reason.

--guy
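A minimal sketch of the per-socket keepalive tuning Guy refers to, assuming Linux and the three options documented in tcp(7): TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT. The helper name enable_keepalive and the timer values in the note below are illustrative and not from the thread. SO_KEEPALIVE has to be switched on first; without it no probes are sent, whatever the timers say.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: enable TCP keepalive on one socket and override
 * the system-wide timers for that socket only (Linux-specific options).
 *   idle_s  - seconds of silence before the first probe (TCP_KEEPIDLE)
 *   intvl_s - seconds between unanswered probes         (TCP_KEEPINTVL)
 *   probes  - unanswered probes before giving up        (TCP_KEEPCNT)
 * Returns 0 on success, -1 on failure with errno set by setsockopt(). */
int enable_keepalive(int fd, int idle_s, int intvl_s, int probes)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof idle_s) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof intvl_s) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof probes) < 0)
        return -1;
    return 0;
}
```

With enable_keepalive(fd, 60, 10, 3), a peer that vanished without closing should be reported after roughly 60 + 3 * 10 = 90 seconds of silence: the socket becomes readable in select() and the following read() fails with ETIMEDOUT instead of returning 0. The options apply to any TCP socket, whether it came from connect() or from accept().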
^T under ubuntu w hebrew keyboard
hi

i have an annoying problem with ubuntu when choosing the hebrew keyboard layout: all the control keys (e.g. ctrl-T for new tab in FF) don't work ... it is probably mapped to CTRL-ALEPH instead, which is no use to me anyway

is there an ubuntu built-in solution? if not, does anyone have a modmap file? or any other solution?

erez.
Re: Fwd: 4GB Memory question
On 5/14/07, Shachar Shemesh [EMAIL PROTECTED] wrote:

> While I cannot rule out that they did the same for hugemem, it still leaves in the question of boot time detection. I'll pose a wild guess as to what is done, and you tell me how likely it is:
>
> At boot time, RHEL 5 tests whether the machine has more than 4GB of ram. If it does, it turns on both PAE and 4/4 split. Otherwise, it's no PAE and 3/1 split

Uhm.. I'm not certain. The PAE kernel *is* for >=4GB. (Though, according to Hetz' observation it seems more like >=3GB)

[EMAIL PROTECTED] /mnt] $ rpm -qp --qf '%{Description}\n' rh1/Server/kernel-PAE-2.6.18-8.el5.i686.rpm 2> /dev/null
This package includes a version of the Linux kernel with support for up to 64GB of high memory. It requires a CPU with Physical Address Extensions (PAE). The non-PAE kernel can only address up to 4GB of memory. Install the kernel-PAE package if your machine has more than 4GB of memory.

So it seems you are correct.

Anyhow, AFAIK, up until now (that is, RHEL3 & 4) there was no use of PAE. So, is it possible that PAE technology, in a way, replaces the hugemem?

- Noam
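The package description above says the PAE kernel "requires a CPU with Physical Address Extensions". On x86 Linux that capability shows up as the "pae" entry on the flags line of /proc/cpuinfo. Below is a small sketch that checks for it; the function names has_flag and cpu_has_pae are made up for this illustration, and the check only says what the CPU supports, not which kernel is running.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if `flag` appears as a whole whitespace-separated word in `line`. */
int has_flag(const char *line, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = line;

    if (n == 0)
        return 0;
    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\n';
        if (starts && ends)
            return 1;
        p += 1;                 /* keep scanning after a partial match */
    }
    return 0;
}

/* Return 1 if the first "flags" line of /proc/cpuinfo lists "pae",
 * 0 if it does not (or no such line exists), -1 if the file is unreadable. */
int cpu_has_pae(void)
{
    char line[8192];
    int found = 0;
    FILE *f = fopen("/proc/cpuinfo", "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "flags", 5) == 0) {
            found = has_flag(line, "pae");
            break;
        }
    }
    fclose(f);
    return found;
}
```

A caller could print the result with something like printf("PAE capable: %d\n", cpu_has_pae()); a return of 1 means the kernel-PAE package can boot on that machine, while whether it is needed still depends on the amount of installed RAM, as the description states.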
Sound on IBM X31
I recently upgraded my kernel to 2.6.18 (Debian package) After upgrading, sound does not work. lspci shows sound modules loaded (oss modules) However gnome volume control says GStreamer plugin not found. Does anyone have any idea? -- Ori Idan
Re: Sound on IBM X31
Hi,

apt-get install aumix. See if aumix works. Also check if you have the udev enabled (if you had it previously) and check permissions on devices like /dev/dsp etc..

Thanks,
JHetz

On 5/14/07, Ori Idan [EMAIL PROTECTED] wrote:

> I recently upgraded my kernel to 2.6.18 (Debian package) After upgrading, sound does not work. lspci shows sound modules loaded (oss modules) However gnome volume control says GStreamer plugin not found. Does anyone have any idea?
>
> -- Ori Idan

-- Skepticism is the lazy person's default position. Visit my blog (hebrew) for things that (sometimes) matter: http://wp.dad-answers.com
Re: Fwd: 4GB Memory question
Hi Shachar!

On Monday 14 May 2007, Shachar Shemesh wrote:

> Noam Meltzer wrote:
>
>> Hi,
>>
>> Quick answer is no. A bit longer answer is:
>>
>> 1- PAE refers to a certain technology avail. in the CPU which allows 32bit kernels to address larger address spaces.
>>
>> 2- Hugemem is a technology which changes the ratio between the user space and kernel space from 3GB/1GB to 4GB/4GB. (So the actually virtual memory refers to the same physical memory) It just gives your processes a bit more a breathing space before starting unmapping/mapping memory from highmem zone to the normal zone.
>
> Actually, if you read the original 4/4 patch, you will see that the two are not as unrelated as it may sound.

Can you please put one line of spacing separating between quoted text and text that you yourself have written in reply? KMail corrected it in the quoted message, but it still appears in the original one by you. The way you're doing it now makes it harder to read. See:

http://www.mail-archive.com/linux-il%40cs.huji.ac.il/msg48634.html

Regards,
Shlomi Fish

-
Shlomi Fish [EMAIL PROTECTED]
Homepage: http://www.shlomifish.org/

If it's not in my E-mail it doesn't happen. And if my E-mail is saying one thing, and everything else says something else - E-mail will conquer.
-- An Israeli Linuxer
RE: need some help with tcp/ip programming
Hi Guy

> Rafi Cohen wrote:
>
>> Hi, as a project for a company I'm writing a TCP/IP application on Linux in C.
>
> ah welcome, welcome to the pleasure dome...

Hmm, thanks for your warm greetings.

>> My application has 2 connections as a client to remote servers and is itself a server accepting remote client connections. I'm using the select() mechanism to manage all those connections. Everything works nicely until one of the remote sides disconnects. In some such cases, I do not succeed in reconnecting to this remote as a client, or in re-accepting a connection from the remote client.
>
> you're not describing things properly here, since later you say you are not managing to disconnect (the problem is not with re-connecting - which, if happened, would imply a different problem altogether)

You are correct. What I indeed meant to say is that in order to re-connect, I first need to disconnect properly, and here lies my problem.

>> Reading some documentation on TCP/IP programming, I had the impression that the select mechanism should detect such a remote disconnect event, thus enabling me to make a further read from this socket, which should end in reading 0 bytes. Reading 0 bytes should indicate disconnection and let me disconnect properly from my side and try to reconnect. However, it seems that select does not detect all those disconnect events and, even worse, I cannot see any rule behind when it does detect this and when it does not.
>
> select does not notice disconnections. it only notices if the socket was closed by the remote side. that's a completely different issue, and that's also the only time when you get a 0 return value from the read() system call.

Yes, this does make sense, and I need to check with the developer of the software mine connects to remotely whether he indeed closes the socket when disconnecting. You gave me a clue, thanks.

>> So, I have a couple of questions and I'd most appreciate any assistance.
>>
>> 1. Would you confirm that select, indeed, does not detect each and every remote disconnection, and do you know if there is a rule behind those cases?
>
> TCP/IP stacks on linux, by default, will only notice a network disconnection (i.e. the network went down) reliably after 2 hours. that's how TCP/IP's internal keep alive mechanism is set. 2 hours is a completely impractical value for any sane system you might develop.
>
> you can tweak this parameter of the TCP/IP stack for specific sockets, on current linux kernels, using a socket option. (man 7 tcp - and look for 'keepalive' - there are 3 parameters for this). i never used this mechanism, since it was only possible to make this change globally when i needed that.

I know of this option and will look into it more deeply. Maybe I'm missing something here, but this option may be relevant for the case where my application is a server. I wonder how it would affect things, if at all, when my application is a client.

>> 2. If I need to detect those disconnections to react properly in my application and I cannot rely on select, what would you suggest I do? Should I use a kind of ping mechanism to check every few seconds if the connection is still alive? Or maybe use multiple threads instead of select, where each thread is responsible for one connection source and, instead of select, I loop on read from this source and so can detect when I read 0 bytes, which is the disconnect indication, and react accordingly?
>
> read would fail just like select does, and because of the same reason. you could implement the keepalive in your application, in case the keepalive parameters tweaking of the TCP stack does not work, for some reason.
>
> --guy

Thanks,
Rafi.
Re: need some help with tcp/ip programming
On 14/05/07, guy keren [EMAIL PROTECTED] wrote:

> Rafi Cohen wrote:
>
>> Reading some documentation on tcp/ip programming, I had the impression that the select mechanism should detect such remote disconnect event, thus enabling me to make a further read from this socket which should end in reading 0 bytes. Reading 0 bytes should indicate disconnection and let me disconnect properly from my side and try to reconnect. However, it seems that select does not detect all those disconnect events and even worse, I can not see any rule behind when it does detect this and when it does not.
>
> select does not notice disconnections. it only notices if the socket was closed by the remote side. that's a completely different issue, and that's also the only time when you get a 0 return value from the read() system call.

I think you are tinkering with semantics and so miss the real issue (do you work as a consultant? :).

Basically - Rafi expects (as he should) that a read(fd,...)==0 after a select(2) call that indicated activity on fd means that the other side has closed the connection.

Alas - I think that I've just read not long ago that there is a bug in Linux' select in implementing just that and it might miss the close from the other side sometimes (sorry, can't find a reference with a quick google, closest I got to might be: http://forum.java.sun.com/thread.jspa?threadID=767657&messageID=4386218). I don't remember what was the work-around to that.

Another point to check - does the read(2) after select(2) return an error? See select_tut(2) for more details on how to program with select - you should check for errors as well instead of just assuming that read(2) must succeed (e.g. interrupt).

Also while you are at it - check whether pselect(2) can help you improve your program's robustness.

Maybe using poll(2) will help you around that (I also heard that poll is generally more efficient because it helps the kernel avoid having to re-interpret the syscall parameters on every call).

--Amos
Re: ^T under ubuntu w hebrew keyboard
On 14/05/07, Erez D [EMAIL PROTECTED] wrote:

> hi
>
> i have an annoying problem with ubuntu when choosing the hebrew keyboard layout: all the control keys (e.g. ctrl-T for new tab in FF) don't work ... it is probably mapped to CTRL-ALEPH instead, which is no use to me anyway
>
> is there an ubuntu built-in solution? if not, does anyone have a modmap file? or any other solution?

I suspect you are hitting the same problem that I just posted a Firefox addon about - it's a Firefox idiocy, not an Ubuntu/GNOME one. Look up the Firefox addons for Russian hot keys bugfix (or search the linux-il archives for a direct link).

HTH,

--Amos
RE: need some help with tcp/ip programming
Amos, thanks for the ideas. I thought about poll and will look into this.

I'm also checking read for errors (values < 0), but in this case there cannot even be errors. Since the socket is disconnected, select does not detect any event on this socket and so does not give me any opportunity to read from it and even get an error.

But thanks anyway.

Rafi.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Amos Shapira
Sent: Monday, May 14, 2007 1:16 PM
To: Linux-IL
Subject: Re: need some help with tcp/ip programming

On 14/05/07, guy keren [EMAIL PROTECTED] wrote:

> Rafi Cohen wrote:
>
>> Reading some documentation on tcp/ip programming, I had the impression that the select mechanism should detect such remote disconnect event, thus enabling me to make a further read from this socket which should end in reading 0 bytes. Reading 0 bytes should indicate disconnection and let me disconnect properly from my side and try to reconnect. However, it seems that select does not detect all those disconnect events and even worse, I can not see any rule behind when it does detect this and when it does not.
>
> select does not notice disconnections. it only notices if the socket was closed by the remote side. that's a completely different issue, and that's also the only time when you get a 0 return value from the read() system call.

I think you are tinkering with semantics and so miss the real issue (do you work as a consultant? :).

Basically - Rafi expects (as he should) that a read(fd,...)==0 after a select(2) call that indicated activity on fd means that the other side has closed the connection.

Alas - I think that I've just read not long ago that there is a bug in Linux' select in implementing just that and it might miss the close from the other side sometimes (sorry, can't find a reference with a quick google, closest I got to might be: http://forum.java.sun.com/thread.jspa?threadID=767657&messageID=4386218). I don't remember what was the work-around to that.

Another point to check - does the read(2) after select(2) return an error? See select_tut(2) for more details on how to program with select - you should check for errors as well instead of just assuming that read(2) must succeed (e.g. interrupt).

Also while you are at it - check whether pselect(2) can help you improve your program's robustness.

Maybe using poll(2) will help you around that (I also heard that poll is generally more efficient because it helps the kernel avoid having to re-interpret the syscall parameters on every call).

--Amos
Re: Fwd: 4GB Memory question
Noam Meltzer wrote:

> So, is it possible that PAE technology, in a way, replaces the hugemem?

Seems extremely unlikely to me. A few words about the technologies (since you brought up the distinction, I'm surprised it is relevant).

On a 32 bit platform each process can address, at most, 4GB of linear memory. Ever since the move to 32 bit the segment registers are no longer used for addressing, and thus are irrelevant for address extension. Let's call the 4GB addressable memory the virtual memory space.

This memory is, of course, mapped to physical memory by means of the MMU, generating page faults whenever an illegal page (whether because it is unmapped or because it is with invalid permissions) is accessed. This allows the kernel to swap out some physical memory area, and replace it with a new area from disk. This is how virtual memory works.

PAE is but an extension to the virtual memory technique, but using unaddressable memory instead of the disk. The machine has 64GB of physical memory, but can only actually address 4GB at a time. Pages of physical memory are swapped in and out of the addressable PHYSICAL range by means of using the PAE, and then, using the MMU, into virtual space. So each of the 64GB physical memory is given a 4GB physical address (not concurrently, of course), and then given a 4GB virtual address for the sake of the actual running processes.

Except we have a problem. Each time we need to switch between user space and kernel space, we need to have the kernel ready and available to us. This must be the case so we can actually handle whatever it is that triggered the move (hardware interrupt, software interrupt or trap). The way we do that is by keeping the entire memory allocated to the kernel (code + data) mapped to the top area of the virtual memory addresses, no matter where we are in the system. Whether we are in kernel space, or each and every running user space process, we always keep the kernel at the same addresses. Of course, if we are in user space we mark the addresses as non-readable, non-writeable, but that's ok, because we can tell the MMU that a certain page is only read/writeable if the CPU is in Ring 0, and the CPU automatically enters ring 0 in case of an interrupt (of any kind). Problem solved.

Except there is one problem with this scenario. This scenario means that whatever memory we reserve for the kernel is subtracted from the *virtual* address space available to user space. Once we decide that the kernel reserves 1GB of addresses for its own use, no matter what program is running, these addresses can never be used for anything else, regardless of how much physical memory is available in the machine.

So, what have we got so far? We have 4GB of address space, which represents the absolute maximum any user space program can hope to address simultaneously. No matter how much the actual machine has, a single process cannot hope to address (directly) more than 4GB of memory. We further reduce this number by allocating some of the addresses to the kernel, creating a split between user space and kernel space *address space*.

How much do we split? Windows splits at 2GB boundary by default. This means that each user space program has a maximum of 2GB of memory available, and the kernel has 2GB of memory as well. We call this a 2/2 split. Linux, by default, allocates 1GB to the kernel, which leaves 3GB to each user space program. We call this a 3/1 split.

Now here's the problem. Sometimes, when there is too much memory in the machine (using PAE), it may turn out that 1GB is not enough to keep track of what virtual address for which process belongs to which physical address. Merely managing the physical memory requires an overhead, and with too much overhead, 1GB is not enough.

There are two possible solutions to this problem. The first is to increase the amount of memory allocated to the kernel. We could, for example, switch from allocating 1GB to the kernel in a 3/1 split to allocating 2GB to the kernel in a 2/2 split (like Windows). This, however, leads to the following absurdity: the more physical memory you have, the less memory each user space program can use!

To avoid this problem, the 4/4 split was invented. What it does, basically, is to not keep the kernel's memory mapped during user space execution. In other words, each time an interrupt arrives, the kernel switches (I haven't looked at the actual code, but I'm assuming through a tiny piece of code that is constantly mapped) the MMU tables, as if a context switch occurred. This, of course, results in higher costs for calling kernel code, but allows us to allocate the entire 4GB address space to the kernel, while allocating the entire 4GB address space to each user space program. According to Noam, this is what hugemem means (I don't know whether that is the case). This is called the 4/4 split.

Luckily, it is fairly simple to test whether a 4/4 split patch is installed in your kernel. All you have to do is try
Re: Fwd: 4GB Memory question
Well,

It would be really nice if someone on this list had a RHEL(/CentOS)5 at hand with >=4GB RAM to test it. (Hetz?)

- Noam

On 5/14/07, Shachar Shemesh [EMAIL PROTECTED] wrote:

> Noam Meltzer wrote:
>
>> So, is it possible that PAE technology, in a way, replaces the hugemem?
>
> Seems extremely unlikely to me. A few words about the technologies (since you brought up the distinction, I'm surprised it is relevant).
>
> On a 32 bit platform each process can address, at most, 4GB of linear memory. Ever since the move to 32 bit the segment registers are no longer used for addressing, and thus are irrelevant for address extension. Let's call the 4GB addressable memory the virtual memory space.
>
> This memory is, of course, mapped to physical memory by means of the MMU, generating page faults whenever an illegal page (whether because it is unmapped or because it is with invalid permissions) is accessed. This allows the kernel to swap out some physical memory area, and replace it with a new area from disk. This is how virtual memory works.
>
> PAE is but an extension to the virtual memory technique, but using unaddressable memory instead of the disk. The machine has 64GB of physical memory, but can only actually address 4GB at a time. Pages of physical memory are swapped in and out of the addressable PHYSICAL range by means of using the PAE, and then, using the MMU, into virtual space. So each of the 64GB physical memory is given a 4GB physical address (not concurrently, of course), and then given a 4GB virtual address for the sake of the actual running processes.
>
> Except we have a problem. Each time we need to switch between user space and kernel space, we need to have the kernel ready and available to us. This must be the case so we can actually handle whatever it is that triggered the move (hardware interrupt, software interrupt or trap).
The way we do that is by keeping the entire memory allocated to the kernel (code + data) mapped to the top area of the virtual memory addresses, no matter where we are in the system. Whether we are in kernel space, or each and every running user space process, we always keep the kernel at the same addresses. Of course, if we are in user space we mark the addresses as non-readable, non-writeable, but that's ok, because we can tell the MMU that a certain page is only read/writeable if the CPU is in Ring 0, and the CPU automatically enters ring 0 in case of an interrupt (of any kind). Problem solved. Except there is one problem with this scenario. This scenario means that whatever memory we reserve for the kernel is subtracted from the *virtual* address space available to user space. Once we decide that the kernel reserves 1GB of addresses for its own use, no matter what program is running, these addresses can never be used for anything else, regardless of how much physical memory is available in the machine. So, what have we got so far? We have 4GB of address space, which represents the absolute maximum any user space program can hope to address simultaneously. No matter how much the actual machine has, a single process cannot hope to address (directly) more than 4GB of memory. We further reduce this number by allocating some of the addresses to the kernel, creating a split between user space and kernel space *address space*. How much do we split? Windows splits at 2GB boundary by default. This means that each user space program has a maximum of 2GB of memory available, and the kernel has 2GB of memory as well. We call this a 2/2 split. Linux, by default, allocates 1GB to the kernel, which leaves 3GB to each user space program. We call this a 3/1 split. Now here's the problem. 
Sometimes, when there is too much memory in the machine (using PAE), it may turn out that 1GB is not enough to keep track of which virtual address of which process belongs to which physical address. Merely managing the physical memory requires an overhead, and with too much overhead, 1GB is not enough. There are two possible solutions to this problem. The first is to increase the amount of memory allocated to the kernel. We could, for example, switch from allocating 1GB to the kernel in a 3/1 split to allocating 2GB to the kernel in a 2/2 split (like Windows). This, however, leads to the following absurdity: the more physical memory you have, the less memory each user space program can use! To avoid this problem, the 4/4 split was invented. What it does, basically, is to not keep the kernel's memory mapped during user space execution. In other words, each time an interrupt arrives, the kernel switches (I haven't looked at the actual code, but I'm assuming through a tiny piece of code that is constantly mapped) the MMU tables, as if a context switch occurred. This, of course, results in higher costs for calling kernel code, but allows us to allocate the entire 4GB address space to the kernel, while allocating the entire 4GB address space to each user space program. According to Noam, this is what hugemem means (I don't
Re: Fwd: 4GB Memory question
On Mon, May 14, 2007 at 01:47:35PM +0300, Shachar Shemesh wrote: PAE is but an extension to the virtual memory technique, but using unaddressable memory instead of the disk. The machine has 64GB of physical memory, but can only actually address 4GB at a time. Pages of physical memory are swapped in and out of the addressable PHYSICAL range by means of using the PAE, and then, using the MMU, into virtual space. So each of the 64GB physical memory is given a 4GB physical address (not concurrently, of course), and then given a 4GB virtual address for the sake of the actual running processes. Hmm? That doesn't sound correct. All PAE does is make it possible to have 36-bit PFNs in the PTEs, so that your physical addressability is up to 64GB. You *can* address all 64GB of physical memory at the same time. In other words PAE lets you map 4GB of virtual -> 64GB of physical. Except we have a problem. Each time we need to switch between user space and kernel space, we need to have the kernel ready and available to us. This must be the case so we can actually handle whatever it is that triggered the move (hardware interrupt, software interrupt or trap). The way we do that is by keeping the entire memory allocated to the kernel (code + data) mapped to the top area of the virtual memory addresses, no matter where we are in the system. Whether we are in kernel space, or each and every running user space process, we always keep the kernel at the same addresses. Of course, if we are in user space we mark the addresses as non-readable, non-writeable, but that's ok, because we can tell the MMU that a certain page is only read/writeable if the CPU is in Ring 0, and the CPU automatically enters ring 0 in case of an interrupt (of any kind). Problem solved. This is misleading. We can have the kernel available for us just fine even if it is not mapped in the user's address space. The reason it is mapped (on x86-32 only!) 
in every process's address space is to cut down on context switch costs, since we aren't really switching address spaces (which would necessitate a TLB flush). Now here's the problem. Sometimes, when there is too much memory in the machine (using PAE), it may turn out that 1GB is not enough to keep track of what virtual address for which process belongs to which physical address. Merely managing the physical memory requires an overhead, and with too much overhead, 1GB is not enough. Again, this is misleading. It's only a problem with the way it's implemented in *linux* on *x86-32*, using mem_map and allocating page tables from low-mem (which we don't do any more if you have CONFIG_HIGHPTE enabled). Alternative implementations are definitely possible. There are two possible solutions to this problem. The first is to increase the amount of memory allocated to the kernel. We could, for example, switch from allocating 1GB to the kernel in a 3/1 split to allocating 2GB to the kernel in a 2/2 split (like Windows). This, however, leads to the following absurd: the more physical memory you have, the less memory each user space program can use! The less *virtual* memory each user space program can use in a single address space is what you meant to say. It's trivial to fork() and thus get a second address space to play with. Additionally, you could use something like shared page tables to solve (or at least mitigate) the same problem. 
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main(int argc, char *argv[])
{
    void *address = (void *)0xc0000000UL; /* start of the top 1GB */

    if (argc > 1) {
        /* Ask for a specific address */
        address = (void *)strtoul(argv[1], NULL, 0);
    }

    if (address == 0) {
        fprintf(stderr, "Must specify legal address as parameter, or give no parameter at all\n"
                        "Use 0x prefix for hexadecimal addresses\n");
        return 1;
    }

    printf("Trying to allocate 1 byte starting at address %p\n", address);

    /* The address is only a hint here: without MAP_FIXED the kernel may
       place the mapping somewhere else. */
    void *alloced = mmap(address, 1, PROT_READ | PROT_WRITE,
                         MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    if (alloced == MAP_FAILED) {
        perror("Failed to map memory");
        return 1;
    }

    /* The quote was cut at this point; what follows is a minimal completion
       so that the program compiles, not the original ending. */
    printf("Got address %p\n", alloced);
    return 0;
}

MAP_FIXED will make this simpler. I applaud your taking the time to write such a detailed explanation. Cheers, Muli
Re: Sound on IBM X31
How do I switch to ALSA on this machine? I have aumix installed. when trying to run it, I get error opening mixer. -- Ori Idan On 5/14/07, Vassilii Khachaturov [EMAIL PROTECTED] wrote: FWIW, I had my share of bad luck with OSS modules on IBM machines, specifically, those with the ICH5 AC97 and derivative on-board sound chips. Especially, with some games the sound wouldn't work because of unsupported 8-bit sound, IIRC. The solution was to switch to ALSA --- everything works like a charm there. Vassilii
Re: Sound on IBM X31
Hi, Use the command: alsamixer Thanks, Hetz On 5/14/07, Ori Idan [EMAIL PROTECTED] wrote: How do I switch to ALSA on this machine? I have aumix installed. when trying to run it, I get error opening mixer. -- Ori Idan On 5/14/07, Vassilii Khachaturov [EMAIL PROTECTED] wrote: FWIW, I had my share of bad luck with OSS modules on IBM machines, specifically, those with the ICH5 AC97 and derivative on-board sound chips. Especially, with some games the sound wouldn't work because of unsupported 8-bit sound, IIRC. The solution was to switch to ALSA --- everything works like a charm there. Vassilii -- Skepticism is the lazy person's default position. Visit my blog (hebrew) for things that (sometimes) matter: http://wp.dad-answers.com
Re: Fwd: 4GB Memory question
Muli Ben-Yehuda wrote: Hmm? That doesn't sound correct. All PAE does is make it possible to have 36-bit PFNs in the PTEs, so that your physical addressability is up to 64GB. You *can* address all 64GB of physical memory at the same time. In other words PAE lets you map 4GB of virtual -> 64GB of physical. Then maybe the misunderstanding is mine. As far as I understand things, you cannot have the following simple loop: for( i=0; i<size; ++i ) dst[i]=src[i]; copy data from one place to the other if size is over 4GB, even with PAE. Would you say that statement is false? If it isn't, then you cannot address more than 4GB of memory at the same time. I don't know the mechanics of PAE. Did it resurrect the dreaded segment registers? Is it a part of the MMU mapping? This is misleading. We can have the kernel available for us just fine even if it is not mapped in the user's address space. The reason it is mapped (on x86-32 only!) in every process's address space is to cut down on context switch costs, since we aren't really switching address spaces (which would necessitate a TLB flush). I actually talked about both the 32 bit issue, as well as the context switch issue, later on. As there is no PAE on 64 bit anyways, I fail to see the point here. As a side note, I'll point out that as far as I understand it, 64 bit does NOT avoid this problem. It just uses the end of the 64bit address range, instead of the 32 bit address range. This means that 32 bit programs running on 64 bit platforms never see the missing part (as their virtual address space is truncated early anyways), and 64 bit programs will not have any problems today because nobody has yet figured out a way to exhaust 16 exabytes of memory. This does NOT mean, however, that there is no theoretical problem. Again, this is misleading. It's only a problem with the way it's implemented in *linux* on *x86-32*, using mem_map and allocating page tables from low-mem (which we don't do any more if you have CONFIG_HIGHPTE enabled). 
Alternative implementations are definitely possible. You still need to keep track of which virtual memory belonging to which context goes in which physical address/swap page. The more physical storage you have, the more you have to keep track of, the more memory you need for that. This means that you may delay the problem, but more memory always seems to mean more overhead. Again, I may be missing something here. the more physical memory you have, the less memory each user space program can use! The less *virtual* memory each user space program can use in a single address space is what you meant to say. That's total nitpicking, but ok. Just replace program with process and I stand by my original statement. It's trivial to fork() and thus get a second address space to play with. Additionally, you could use something like shared page tables to solve (or at least mitigate) the same problem. Yes, you could do all those. You can also map a large file on disk to memory in segments, picking whatever you currently need to use. There are lots of ways to mitigate the problem, and they require special handling by the program. My point was that the basic kernel-managed resource called virtual memory becomes scarcer as more physical memory is added, which is something of an absurdity, which is why the 4/4 patch was originally created. MAP_FIXED will make this simpler. While true, MAP_FIXED will also fail if the specific address you asked for happened to be allocated, even if memory in the tested region is available. To me, this means it is less effective as a detection tool, as it generates false negatives. I applaud your taking the time to write such a detailed explanation. Despite the inaccuracies? :-) Thanks. Cheers, Muli Shachar
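For anyone who wants to try the detection idea discussed above (pass the address as a plain hint and see where the mapping actually lands), here is a minimal sketch in C. The helper name probe_address is made up for illustration and does not come from the thread; the sketch assumes Linux with anonymous mappings available.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std=c99/c11 */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): request one anonymous page
 * with `hint` as a plain hint (no MAP_FIXED) and report where it landed.
 * Returns 1 if the mapping landed exactly at `hint`, 0 if the kernel
 * chose another address, -1 if mmap() itself failed.
 */
static int probe_address(void *hint)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *got = mmap(hint, page, PROT_READ | PROT_WRITE,
                     MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);

    if (got == MAP_FAILED)
        return -1;

    int exact = (got == hint);
    munmap(got, page); /* this is only a probe, release the page again */
    return exact;
}
```

On a 32-bit kernel with a 3/1 split one would expect probe_address((void *)0xc0000000UL) to return 0, since that address belongs to the kernel; that expectation is an assumption about the setup and not something tested here.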
Re: Fwd: 4GB Memory question
On Mon, May 14, 2007 at 02:42:11PM +0300, Shachar Shemesh wrote: Muli Ben-Yehuda wrote: Hmm? That doesn't sound correct. All PAE does is make it possible to have 36-bit PFNs in the PTEs, so that your physical addressability is up to 64GB. You *can* address all 64GB of physical memory at the same time. In other words PAE lets you map 4GB of virtual -> 64GB of physical. Then maybe the misunderstanding is mine. As far as I understand things, you cannot have the following simple loop: for( i=0; i<size; ++i ) dst[i]=src[i]; copy data from one place to the other if size is over 4GB, even with PAE. Would you say that statement is false? If it isn't, then you cannot address more than 4GB of memory at the same time. You are confusing *virtual* memory and *physical* memory. PAE has nothing to do with virtual memory and everything to do with physical memory. On a 32-bit platform, pointers are limited to addressing 4GB (== 2^32) of virtual memory. However, the 32-bit x86 platform also had a related limitation with regard to *physical memory*. Actually, two limitations: first, there were only 32 address lines to the memory bus, so only addresses in the range 0-4GB could be communicated to the memory bus. Second, the page table format (specifically, the page table entry (PTE) format) specified only 32 bits for the physical address. PAE solved both of these issues: 4 more address lines were added, and the PTE format was changed to 36 bits for the physical address. That's how you get 4GB of virtual addressability and up to 64GB (== 2^36) of *physical* addressability. I don't know the mechanics of PAE. Did it resurrect the dreaded segment registers? Is it a part of the MMU mapping? See above, or your favorite search engine :-) Again, this is misleading. It's only a problem with the way it's implemented in *linux* on *x86-32*, using mem_map and allocating page tables from low-mem (which we don't do any more if you have CONFIG_HIGHPTE enabled). 
Alternative implementations are definitely possible. You still need to keep track of which virtual memory belonging to which context goes in which physical address/swap page. The more physical storage you have, the more you have to keep track of, the more memory you need for that. This means that you may delay the problem, but more memory always seems to mean more overhead. More accurate would be to say that more physical memory requires more work to keep track of, but you can trade off computation and storage when performing that work. My point was that the basic kernel-managed resource called virtual memory becomes scarcer as more physical memory is added, which is something of an absurdity, which is why the 4/4 patch was originally created. .. and my point was that this is a Linux design and implementation issue, rather than a universal truth (unless you restate it as I did above in terms of computation vs. storage overhead). Cheers, Muli
Re: ^T under ubuntu w hebrew keyboard
On Mon, 14-05-2007 at 11:08 +0300, Erez D wrote: hi i have an annoying problem with ubuntu when choosing hebrew keyboard layout: I had a similar problem with Kubuntu, but not with Ubuntu. AFAIK the problem is with KDE and not only with the *ubuntu distros. all the control keys (e.g. ctrl-T for new tab in FF) don't work ... it is probably mapped to CTRL-ALEPH instead, which is no use to me anyway Check your keyboard properties at the Gnome panel. is there an ubuntu built-in solution ? if not, anyone has a modmap file ? or any other solution ? If you are using KDE, follow this thread [1] or search the archive, also at [1]. Links: [1] http://www.mail-archive.com/linux-il@cs.huji.ac.il/msg47621.html Julian erez. -- Julian Daich [EMAIL PROTECTED]
Re: Fwd: 4GB Memory question
Muli Ben-Yehuda wrote: You are confusing *virtual* memory and *physical* memory. PAE has nothing to do with virtual memory and everything to do with physical memory. I don't think I am. The simple truth of the matter is that it is not possible to access physical memory directly (at least, not unless you are in 16 bit real mode, in which case your constraints are much much harsher than 32 bit, they are more like 20 bit). As such, since a single linear pointer can only be 32 bit, you cannot simultaneously access more than 4GB of memory consecutively. Yes, there are plenty of tricks you can do. You can resurrect segmented addresses. You can juggle the physical memory around the virtual addresses. You can (as Linux does), allocate the different physical addresses to different contexts. Either way, this is not as simple as merely accessing the full 64GB as if they were one contiguous memory (which is what you could do with 64bit platform). I don't know the mechanics of PAE. Did it resurrect the dreaded segment registers? Is it a part of the MMU mapping? See above, or your favorite search engine :-) Which roughly translates to if you wish. Modern operating systems have made us used to not touching the segment registers (which is a good thing), so we are used to only using what Intel provides us in the offset section of the address. You could, however, use the segments to point to different entries in the PTE, and thus access two 4GB chunks simultaneously. More accurate would be to say that more physical memory requires more work to keep track of, but you can trade off computation and storage when performing that work. Yes, I guess you could, theoretically, do some level of memory-CPU trade off here. I don't see that it will actually bring you down to O(1) memory usage, since you do need to keep track of more than what's used and what's free, but this really delves into serious nitpicking, and I should stop here. ... 
and my point was that this is a Linux design and implementation issue, rather than a universal truth (unless you restate it as I did above in terms of computation vs. storage overhead). Ok. So let's agree that Linux requires more memory overhead the more physical memory is available, which results in the above mentioned trade off (more physical - less virtual), which is why the 4/4 split patch was originally written. I am sure the difference was crucial to the understanding of my explanation about the difference between PAE and highmem. Happy? Cheers, Muli To save you the urge to use a magnifying glass to look for more inaccuracies and things which are not 102.5% correct, I misspelt a few of the words in this email. Wouldn't want to have a good nitpicking thread stop. Shachar
Re: Fwd: 4GB Memory question
On Mon, May 14, 2007 at 03:20:10PM +0300, Shachar Shemesh wrote: Muli Ben-Yehuda wrote: You are confusing *virtual* memory and *physical* memory. PAE has nothing to do with virtual memory and everything to do with physical memory. I don't think I am. The simple truth of the matter is that it is not possible to access physical memory directly (at least, not unless you are in 16 bit real mode, in which case your constraints are much much harsher than 32 bit, they are more like 20 bit). As such, since a single linear pointer can only be 32 bit, you cannot simultaneously access more than 4GB of memory consecutively. Again, you can access 4GB of *virtual* memory consecutively, but each 4K of that virtual memory can address *any* physical address between 0 and 64GB (assuming x86-32-pae). I hope we agree on the above. If not, see the explanation below. A linear pointer has a virtual address; that virtual address is translated by the hardware to a physical frame number + offset. The translation mechanism used (on x86-32) is page tables. Now, the way the page tables work is that the translation for each page of virtual memory is completely independent of the translation of every other page of virtual memory. Therefore, it is possible to have in a single address space, two consecutive page-aligned virtual addresses, let's say 0x1000 and 0x2000 (4K page size on x86-32) such that the first one is translated to physical address 0x0 and the second one is translated to physical address 0x200000000 (8GB), which are obviously more than 4GB apart. Do I need to draw out the PGD, PMD and PTE that would lead to this translation? Yes, there are plenty of tricks you can do. You can resurrect segmented addresses. You can juggle the physical memory around the virtual addresses. You can (as Linux does), allocate the different physical addresses to different contexts. 
Either way, this is not as simple as merely accessing the full 64GB as if they were one contiguous memory (which is what you could do with 64bit platform). Let me rephrase, because I think we're converging (and there I was having such fun...). On a 32-bit platform, a single virtual address space is limited to 4GB in size. But - *any* page in that address space can map *any* physical frame from 0-64GB (assuming PAE). Which roughly translates to if you wish. Modern operating systems have made us used to not touching the segment registers (which is a good thing), so we are used to only using what Intel provides us in the offset section of the address. You could, however, use the segments to point to different entries in the PTE, and thus access two 4GB chunks simultaneously. A PTE is a single entry (that's what the E stands for...) Not sure what you're trying to say here. Happy? Joyful. To save you on the urge to use a magnifying glass to look for more inaccuracies and things which are not 102.5% correct, I misspelt a few of the words in this email. Wouldn't want to have a good nitpicking thread stop I'm sorry, the compile is done and I must get back to work now. Nitpickingly yours, Muli
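The point about each virtual page being translated independently can be illustrated with a toy model in C. This is only a sketch: it uses a single flat table indexed by virtual page number, whereas real PAE hardware walks a multi-level structure, and the names toy_table, toy_map and toy_translate are invented for this example.

```c
#include <stdint.h>

#define PAGE_SHIFT       12u     /* 4K pages, as in the example above */
#define PAGE_OFFSET_MASK 0xfffu  /* low 12 bits are the in-page offset */
#define TOY_PAGES        16u     /* toy address space: 16 virtual pages */

/* One entry per virtual page, holding a physical frame number. The
 * entries are 64 bits wide so a frame above the 4GB line fits, which is
 * the part PAE added. */
static uint64_t toy_table[TOY_PAGES];

/* Point the virtual page containing vaddr at the frame containing paddr.
 * The modulo only keeps the toy index in bounds. */
static void toy_map(uint32_t vaddr, uint64_t paddr)
{
    toy_table[(vaddr >> PAGE_SHIFT) % TOY_PAGES] = paddr >> PAGE_SHIFT;
}

/* Translate a 32-bit virtual address to a (possibly wider) physical one. */
static uint64_t toy_translate(uint32_t vaddr)
{
    uint64_t frame = toy_table[(vaddr >> PAGE_SHIFT) % TOY_PAGES];
    return (frame << PAGE_SHIFT) | (vaddr & PAGE_OFFSET_MASK);
}
```

With toy_map(0x1000, 0x0) and toy_map(0x2000, 0x200000000ULL), two adjacent virtual pages end up 8GB apart physically, while the pointer used to reach them never needs more than 32 bits.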
Re: need some help with tcp/ip programming
this is a great thread - i'm learning a lot by reading it, even though i've been programming sockets for years. thanks for the question, and thanks for all the great answers. Another point to check - does the read(2) after select(2) return an error? See select_tut(2) for more details on how to program with select - you should check for errors as well instead of just assuming that read(2) must succeed (e.g. interrupt). Also while you are at it - check whether pselect(2) can help you improve your program's robustness. there is a distinction between different types of read() errors. at the very least, if you set the non-blocking option, you will get a response indicating that there is no data. if you select a non-zero timeout, you get a different return value indicating the timeout was hit w/ no data. if the other side disconnects, you get a different failure, and finally, there is a catch-all error for other reasons. IIRC, this was also somewhat implementation specific. it's been awhile, and it may have been Linux/OS X differences, but i would recommend that you are looking at the appropriate man page for your stack. if you do a random google, you may get the netbsd/freebsd stack, which may be different. Maybe using poll(2) will help you around that (I also heard that poll is generally more efficient because it helps the kernel avoid having to re-interpret the syscall parameters on every call). this is interesting. can anyone provide more info on this? thanks, michael
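The distinction between the different read() outcomes described above can be sketched roughly as follows, in C. The enum and the function name classify_read are invented for this illustration; which errno values a given stack actually reports is something to confirm in the local man pages, as the message suggests.

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical labels for what a read(2) on a socket can tell us. */
enum read_status {
    READ_DATA,        /* n > 0: got some bytes */
    READ_EOF,         /* n == 0: the peer closed its side */
    READ_WOULD_BLOCK, /* non-blocking socket with nothing to read yet */
    READ_INTERRUPTED, /* a signal arrived first, simply retry */
    READ_ERROR        /* anything else, e.g. ECONNRESET */
};

/* n is the return value of read(2); err is errno sampled right after it. */
static enum read_status classify_read(ssize_t n, int err)
{
    if (n > 0)
        return READ_DATA;
    if (n == 0)
        return READ_EOF;
    if (err == EAGAIN || err == EWOULDBLOCK)
        return READ_WOULD_BLOCK;
    if (err == EINTR)
        return READ_INTERRUPTED;
    return READ_ERROR;
}
```

A caller would do something like n = read(fd, buf, sizeof buf); switch (classify_read(n, errno)) { ... }. The timeout case mentioned above is reported by select(2) itself returning 0, before read() is ever called, so it has no entry here.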
Re: need some help with tcp/ip programming
(rafi - your quoting mixes your text with mine - you might want to fix this - it was very hard to read your letter). see my comments below: Rafi Cohen wrote: Hi Guy Rafi Cohen wrote: So, I have a couple of questions and I'll most appreciate any assistance. 1. Would you confirm that select, indeed, does not detect each and every remote disconnection and do you know if there is a rule behind those cases? TCP/IP stacks on linux, by default, will only notice a network disconnection (i.e. the network went down) reliably after 2 hours. that's how TCP/IP's internal keep alive mechanism is set. 2 hours is a completely impractical value for any sane system you might develop. you can tweak this parameter of the TCP/IP stack for specific sockets, on current linux kernels, using a socket option. (man 7 tcp - and look for 'keepalive' - there are 3 parameters for this). i never used this mechanism, since it was only possible to make this change globally when i needed that. and rafi responds: I know of this option and will look into that deeper. Maybe I am missing something here, but this option may be relevant for the case my application is a server. I wonder how it would affect, if at all, when my application is a client. it does not matter if you have a client or a server - you want to know about network problems either way - unless you assume that the client is a GUI and the user will simply hit the 'cancel' button. since in your case it does not appear to be a GUI - rather a longer-living server, then you might want to handle the disconnection issues both on your side, and on the side of the server. note that for some applications, it is enough to have a 'close the socket if it was idle for X time' - i.e. if you didn't get any data from a socket during X minutes - you close it. Thanks, Rafi. hope this makes it clearer. --guy
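For reference, the per-socket keepalive tuning mentioned above looks roughly like this in C on Linux. The option names are the ones documented in tcp(7); the wrapper function enable_keepalive and the example values are made up for illustration, and the sketch works the same for a client or a server socket.

```c
#define _DEFAULT_SOURCE /* for the Linux TCP_KEEP* options under strict -std */
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical wrapper: turn on keepalive for one TCP socket and override
 * the three per-socket parameters (Linux-specific options).
 *   idle_secs     - idle time before the first probe is sent
 *   interval_secs - time between unanswered probes
 *   probes        - unanswered probes before the connection is dropped
 * Returns 0 on success, -1 if any setsockopt() call fails.
 */
static int enable_keepalive(int fd, int idle_secs, int interval_secs, int probes)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_secs, sizeof idle_secs) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_secs, sizeof interval_secs) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof probes) < 0)
        return -1;
    return 0;
}
```

With enable_keepalive(fd, 60, 10, 5), a dead peer should be noticed after roughly 60 + 5*10 seconds of silence instead of about two hours; once the probes fail, the socket becomes readable in select() and the following read() reports an error.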
Re: need some help with tcp/ip programming
[EMAIL PROTECTED] wrote: amos shapira wrote: Maybe using poll(2) will help you around that (I also heard that poll is generally more efficient because it helps the kernel avoid having to re-interpret the syscall parameters on every call). this is interesting. can anyone provide more info on this? the problem with select, is that it is unable to optimize handling of 'holes' in the file descriptor set. suppose that you need to select on file descriptors 2 and 4000. you need to pass info about all file descriptors up to 4000 (i.e. many '0' bits, and only two '1' bits, in the different select sets). with poll, you pass an array of the descriptors you care about. so the size of the array is proportional to the amount of descriptors you are interested in, while with select it is proportional to the numeric value of the largest descriptor you are interested in. note that this is relevant only for applications that have many open sockets. when you use poll, you can use the trick of having 2 threads - one polls on idle sockets (i.e. sockets that did not have I/O in the last X seconds), and one listens on 'active' sockets (i.e. sockets that had I/O in the last X seconds). this avoids the major problem with both select and poll - that after an event on a single socket, the info for all the sockets has to be copied to user space (when select/poll returns), and then to kernel space again (when invoking poll/select again). i think that people added epoll support in order to avoid waking the poll function altogether - by receiving a signal from the kernel with the exact info, instead of having to return from poll. --guy
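To make the select/poll difference above concrete, here is a small sketch in C of the poll(2) side: the array handed to the kernel has one entry per descriptor of interest, however large the descriptor numbers are. The function name first_readable and the MAX_FDS limit are invented for this example.

```c
#include <poll.h>

#define MAX_FDS 64 /* arbitrary cap for this sketch */

/*
 * Hypothetical helper: wait up to timeout_ms for any of fds[0..count-1]
 * to become readable (or to report hangup/error).
 * Returns the index of the first such descriptor, -1 on timeout (or if
 * nothing readable was reported), -2 on error or bad arguments.
 */
static int first_readable(const int *fds, int count, int timeout_ms)
{
    struct pollfd pfds[MAX_FDS];

    if (count < 0 || count > MAX_FDS)
        return -2;

    /* One array slot per descriptor we care about; fds 2 and 4000 would
     * need two slots here, not a 4001-bit set as with select(2). */
    for (int i = 0; i < count; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    int ready = poll(pfds, (nfds_t)count, timeout_ms);
    if (ready < 0)
        return -2;
    if (ready == 0)
        return -1;

    for (int i = 0; i < count; i++)
        if (pfds[i].revents & (POLLIN | POLLHUP | POLLERR))
            return i;
    return -1;
}
```

Note that the whole array is still rebuilt and passed in on every call, which is the copying cost described above.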
Re: need some help with tcp/ip programming
Amos Shapira wrote: On 14/05/07, guy keren [EMAIL PROTECTED] wrote: Rafi Cohen wrote: Reading some documentation on tcp/ip programming, I had the impression that the select mechanism should detect such remote disconnect event, thus enabling me to make a further read from this socket which should end in reading 0 bytes. Reading 0 bytes should indicate disconnection and let me disconnect properly from my side and try to reconnect. However, it seems that select does not detect all those disconnect events and even worse, I cannot see any rule behind when it does detect this and when it does not. select does not notice disconnections. it only notices if the socket was closed by the remote side. that's a completely different issue, and that's also the only time when you get a 0 return value from the read() system call. I think you are tinkering with semantics and so miss the real issue (do you work as a consultant? :). did you write that to rafi or to me? i'm not dealing with semantics - i am dealing with a real problem, that stable applications have to deal with - when the network breaks, and you never get the close from the other side. Basically - Rafi expects (as he should) that a read(fd,...)==0 after a select(2) call that indicated activity on fd means that the other side has closed the connection. if this is what he expects then, indeed, this is what happens. Alas - I think that I've just read not long ago that there is a bug in Linux' select in implementing just that and it might miss the close from the other side sometimes. what you are describing here sounds astonishing - that such a basic feature of the sockets implementation is broken? i find this hard to believe, without clear evidence. (sorry, can't find a reference with a quick google, closest I got to might be: http://forum.java.sun.com/thread.jspa?threadID=767657&messageID=4386218). 
I don't remember what was the work-around to that. you're describing an issue with JVM - not with linux. i never encountered such a problem when doing socket programming in C or C++. if you can find something clearer about this, that will be very interesting. Another point to check - does the read(2) after select(2) return an error? See select_tut(2) for more details on how to program with select - you should check for errors as well instead of just assuming that read(2) must succeed ( e.g. interrupt). Also while you are at it - check whether pselect(2) can help you improve your program's robustness. Maybe using poll(2) will help you around that (I also heard that poll is generally more efficient because it helps the kernel avoid having to re-interpret the syscall parameters on every call). it helps avoiding copying too much data to/from kernel space on a sparse sockets list, and it helps avoiding having to scan large sets in the kernel, to initialize its own internal data structures. --guy
RE: need some help with tcp/ip programming
Thank you very much Guy and sorry for not writing the text in an appropriate way. Usually, I reply above the original message, but this time tried to mix my comments close to your text so that they make sense and you don't lose the context. Next time I'll try to do better. Thanks for the most valuable information, Rafi. -----Original Message----- From: guy keren [mailto:[EMAIL PROTECTED] Sent: Monday, May 14, 2007 11:18 PM To: Rafi Cohen Cc: '[EMAIL PROTECTED] Org. Il' Subject: Re: need some help with tcp/ip programming (rafi - your quoting mixes your text with mine - you might want to fix this - it was very hard to read your letter). see my comments below: Rafi Cohen wrote: Hi Guy Rafi Cohen wrote: So, I have a couple of questions and I'll most appreciate any assistance. 1. Would you confirm that select, indeed, does not detect each and every remote disconnection and do you know if there is a rule behind those cases? TCP/IP stacks on linux, by default, will only notice a network disconnection (i.e. the network went down) reliably after 2 hours. that's how TCP/IP's internal keep alive mechanism is set. 2 hours is a completely impractical value for any sane system you might develop. you can tweak this parameter of the TCP/IP stack for specific sockets, on current linux kernels, using a socket option. (man 7 tcp - and look for 'keepalive' - there are 3 parameters for this). i never used this mechanism, since it was only possible to make this change globally when i needed that. and rafi responds: I know of this option and will look into that deeper. Maybe I am missing something here, but this option may be relevant for the case my application is a server. I wonder how it would affect, if at all, when my application is a client. it does not matter if you have a client or a server - you want to know about network problems either way - unless you assume that the client is a GUI and the user will simply hit the 'cancel' button. 
since in your case it does not appear to be a GUI - rather a longer-living server, then you might want to handle the disconnection issues both on your side, and on the side of the server. note that for some applications, it is enough to have a 'close the socket if it was idle for X time' - i.e. if you didn't get any data from a socket during X minutes - you close it. Thanks, Rafi. hope this makes it clearer. --guy
Re: need some help with tcp/ip programming
On 15/05/07, guy keren [EMAIL PROTECTED] wrote:

>> I think you are tinkering with semantics and so miss the real issue (do you work as a consultant? :).
>
> did you write that to rafi or to me? i'm not dealing with semantics - i am dealing with a real problem that stable applications have to deal with - when the network breaks, and you never get the close from the other side.

I wrote this to you, Guy. Rafi maybe used "disconnect" when he basically meant that the TCP connection went down from the other side, while you seemed to hang on "disconnect" being defined as "cable eaten by an alligator" :). As long as Rafi feels happy about the replies that's not relevant any more, IMHO.

>> Alas - I think that I read not long ago that there is a bug in Linux's select in implementing just that, and it might sometimes miss the close from the other side
>
> what you are describing here sounds astonishing - that such a basic feature of the sockets implementation is broken? i find this hard to believe without clear evidence.

Here is something about what I read before. It's the other way around, and possibly only relevant to UDP, but I'm not sure - if a packet arrives with a bad CRC, it's possible that the FD will be marked as ready to read by select, but then the packet will be discarded (because of the CRC error), and when the process reads the socket it won't get anything. That would make the process get a 0 read right after select, which does NOT indicate a close from the other side.

http://www.uwsg.indiana.edu/hypermail/linux/kernel/0410.2/0001.html

I don't know what would be a select(2)-based work-around, if one is required at all.

>> (sorry, can't find a reference with a quick google; the closest I got might be: http://forum.java.sun.com/thread.jspa?threadID=767657messageID=4386218). I don't remember what the work-around to that was.
>
> you're describing an issue with the JVM - not with linux. i never encountered such a problem when doing socket programming in C or C++. if you can find something clearer about this, that will be very interesting.

Yes, it was a JVM bug, but it mentioned differences on Linux vs. other POSIX systems, so I thought it might be related.

> it helps avoid copying too much data to/from kernel space on a sparse sockets list, and it helps avoid having to scan large sets in the kernel to initialize its own internal data structures.

Actually, epoll looks really cool, and Boost's ASIO seems to provide a portable C++ interface around it: http://asio.sourceforge.net/

On the other hand - if you are listening on many FDs which turn out to be ready, then epoll apparently loses because it requires a syscall (or kernel intervention) on every single FD, making select(2) (/poll(2)?) more attractive.

Cheers,

--Amos
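For readers who want to try a portable wrapper of the kind Amos describes without pulling in ASIO, Python's standard `selectors` module plays a similar role: `DefaultSelector` picks epoll on Linux, kqueue on the BSDs, and falls back to poll or select elsewhere. The sketch below is only an illustration of that interface; `wait_readable` is a hypothetical helper name, and the socket pairs stand in for real network connections.

```python
import selectors
import socket


def wait_readable(socks, timeout=1.0):
    """Return the sockets from socks that are reported readable within timeout."""
    with selectors.DefaultSelector() as sel:
        # Registration costs one call per descriptor (epoll_ctl on Linux) ...
        for s in socks:
            sel.register(s, selectors.EVENT_READ)
        # ... but one select() call reports every ready descriptor at once.
        return [key.fileobj for key, _events in sel.select(timeout)]


if __name__ == "__main__":
    a, b = socket.socketpair()
    c, d = socket.socketpair()
    b.sendall(b"x")  # only the a/b pair has data pending
    ready = wait_readable([a, c], timeout=0.5)
    print(ready == [a])
    for s in (a, b, c, d):
        s.close()
```

In a long-running program the selector would be created once and kept, so the per-descriptor registration cost is paid when a connection is accepted rather than on every wait.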
Re: need some help with tcp/ip programming
Amos Shapira wrote:

> On 15/05/07, guy keren [EMAIL PROTECTED] wrote:
>
>>> I think you are tinkering with semantics and so miss the real issue (do you work as a consultant? :).
>>
>> did you write that to rafi or to me? i'm not dealing with semantics - i am dealing with a real problem that stable applications have to deal with - when the network breaks, and you never get the close from the other side.
>
> I wrote this to you, Guy. Rafi maybe used "disconnect" when he basically meant that the TCP connection went down from the other side, while you seemed to hang on "disconnect" being defined as "cable eaten by an alligator" :).

let's leave this subject. i brought it up because many programmers new to socket programming are surprised by the fact that a network disconnection does not cause the socket to close, and that the connection may stay there for hours.

> As long as Rafi feels happy about the replies that's not relevant any more, IMHO.
>
>>> Alas - I think that I read not long ago that there is a bug in Linux's select in implementing just that, and it might sometimes miss the close from the other side
>>
>> what you are describing here sounds astonishing - that such a basic feature of the sockets implementation is broken? i find this hard to believe without clear evidence.
>
> Here is something about what I read before. It's the other way around, and possibly only relevant to UDP, but I'm not sure - if a packet arrives with a bad CRC, it's possible that the FD will be marked as ready to read by select, but then the packet will be discarded (because of the CRC error), and when the process reads the socket it won't get anything. That would make the process get a 0 read right after select, which does NOT indicate a close from the other side.
>
> http://www.uwsg.indiana.edu/hypermail/linux/kernel/0410.2/0001.html
>
> I don't know what would be a select(2)-based work-around, if one is required at all.

first, it does not return a '0 read'. this situation could have two different effects, depending on the blocking mode of the socket.

if the socket is in blocking mode (the default mode) - select() might state there's data to be read, but recvmsg (or read) will block.

if the socket is in non-blocking mode - select() might state there's data to be read, but recvmsg (or read) will return -1, with errno set to EAGAIN.

in neither case will read return 0. the only time that read is allowed to return 0 is when it encounters an EOF. for a socket, this happens ONLY if the other side closed the sending side of the connection.

of course, whenever i did select-based socket programming, i always set the sockets to non-blocking mode. this requires some careful programming to avoid busy-waits, but it's the only way to guarantee fully non-blocking behaviour. people should also note that the socket should be set to non-blocking mode before calling connect, and be ready to handle the peculiar way that the connect call works for non-blocking sockets.

doing socket programming without referencing stevens' latest TCP/IP book is foolish.

>>> (sorry, can't find a reference with a quick google; the closest I got might be: http://forum.java.sun.com/thread.jspa?threadID=767657messageID=4386218). I don't remember what the work-around to that was.
>>
>> you're describing an issue with the JVM - not with linux. i never encountered such a problem when doing socket programming in C or C++. if you can find something clearer about this, that will be very interesting.
>
> Yes, it was a JVM bug, but it mentioned differences on Linux vs. other POSIX systems, so I thought it might be related.

probably not in this case, because the problem you originally described most likely does not exist. the other way around does exist, if one uses blocking sockets. but then again, no one uses blocking sockets in server software, unless they have a pair of reader+writer threads per socket - and even that may cause problems when shutting down the application.

>> it helps avoid copying too much data to/from kernel space on a sparse sockets list, and it helps avoid having to scan large sets in the kernel to initialize its own internal data structures.
>
> Actually, epoll looks really cool, and Boost's ASIO seems to provide a portable C++ interface around it: http://asio.sourceforge.net/
>
> On the other hand - if you are listening on many FDs which turn out to be ready, then epoll apparently loses because it requires a syscall (or kernel intervention) on every single FD, making select(2) (/poll(2)?) more attractive.

besides epoll being non-portable, and thus it doesn't get used too
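guy's distinction between "no data right now" (EAGAIN) and "the peer closed its sending side" (a read of 0 bytes) can be checked on a local socket pair. A minimal sketch, in Python for brevity: `recv` wraps the same system call C code would use, a would-block condition surfaces as `BlockingIOError` rather than -1/EAGAIN, and EOF surfaces as an empty bytes object. `classify_read` is a hypothetical helper name, and the example applies to stream sockets only.

```python
import socket


def classify_read(sock, bufsize=4096):
    """Do one read on a non-blocking stream socket and report what happened."""
    try:
        data = sock.recv(bufsize)
    except BlockingIOError:
        # C equivalent: recv() returned -1 with errno == EAGAIN/EWOULDBLOCK.
        return ("would_block", b"")
    if data == b"":
        # C equivalent: recv() returned 0, i.e. EOF from the peer.
        return ("eof", b"")
    return ("data", data)


if __name__ == "__main__":
    a, b = socket.socketpair()
    a.setblocking(False)
    print(classify_read(a))       # nothing has been sent yet
    b.sendall(b"hi")
    print(classify_read(a))       # the two bytes arrive
    b.shutdown(socket.SHUT_WR)    # peer closes its sending side
    print(classify_read(a))       # only now does the read come back empty
    a.close()
    b.close()
```

The point for a select-based loop is that "would_block" after a readiness report is harmless and should simply send the program back to waiting, while "eof" is the only result that should trigger the close-and-reconnect path.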
Re: need some help with tcp/ip programming
On 15/05/07, guy keren [EMAIL PROTECTED] wrote:

> Amos Shapira wrote:
>
>> On 15/05/07, guy keren [EMAIL PROTECTED] wrote:
>>
>>>> I think you are tinkering with semantics and so miss the real issue (do you work as a consultant? :).
>>>
>>> did you write that to rafi or to me? i'm not dealing with semantics - i am dealing with a real problem that stable applications have to deal with - when the network breaks, and you never get the close from the other side.
>>
>> I wrote this to you, Guy. Rafi maybe used "disconnect" when he basically meant that the TCP connection went down from the other side, while you seemed to hang on "disconnect" being defined as "cable eaten by an alligator" :).
>
> let's leave this subject. i brought it up because many programmers new to socket programming are surprised by the fact that a network disconnection does not cause the socket to close, and that the connection may stay there for hours.
>
>> As long as Rafi feels happy about the replies that's not relevant any more, IMHO.
>>
>>>> Alas - I think that I read not long ago that there is a bug in Linux's select in implementing just that, and it might sometimes miss the close from the other side
>>>
>>> what you are describing here sounds astonishing - that such a basic feature of the sockets implementation is broken? i find this hard to believe without clear evidence.
>>
>> Here is something about what I read before. It's the other way around, and possibly only relevant to UDP, but I'm not sure - if a packet arrives with a bad CRC, it's possible that the FD will be marked as ready to read by select, but then the packet will be discarded (because of the CRC error), and when the process reads the socket it won't get anything. That would make the process get a 0 read right after select, which does NOT indicate a close from the other side.
>>
>> http://www.uwsg.indiana.edu/hypermail/linux/kernel/0410.2/0001.html
>>
>> I don't know what would be a select(2)-based work-around, if one is required at all.
>
> first, it does not return a '0 read'. this situation could have two different effects, depending on the blocking mode of the socket.
>
> if the socket is in blocking mode (the default mode) - select() might state there's data to be read, but recvmsg (or read) will block.
>
> if the socket is in non-blocking mode - select() might state there's data to be read, but recvmsg (or read) will return -1, with errno set to EAGAIN.
>
> in neither case will read return 0. the only time that read is allowed to return 0 is when it encounters an EOF. for a socket, this happens ONLY if the other side closed the sending side of the connection.

Is there an on-line reference (or a manual page) to support this? From what I remember about select, the definition of it returning a "ready to read" bit set is "the next read won't block", which will be true for non-blocking sockets at any time, and therefore they weren't encouraged together with select.

> of course, whenever i did select-based socket programming, i always set the sockets to non-blocking mode. this requires some careful programming to avoid busy-waits, but it's the only way to guarantee fully non-blocking behaviour. people should also note that the socket should be set to non-blocking mode before calling connect, and be ready to handle the peculiar way that the connect call works for non-blocking sockets.

Also there is the issue of signals. If you want robust programs then you'll have to use pselect.

> doing socket programming without referencing stevens' latest TCP/IP book is foolish.

Sorry for being foolish. I learned TCP/IP from RFCs and socket programming from the BSD 4.2 sources in '86; Stevens' book wasn't available then. :^) I have since read the early editions of his books (circa the early 90's; I remember reading one volume while the later ones were still in the making), but it's been a while since I had to write a complete C socket program with select in earnest, and I accept that some interfaces may have changed over the years.

These days, with pthreads being mainstream, I'd consider using multiple threads. select() is nice when you absolutely *must* use a single thread (which was the case back when pthreads wasn't invented yet, or later when the various UNIX versions had their own ideas about thread APIs), but if you have so many connections that multiple threads will become a problem, then a single thread having to cycle through all these connections one by one will also slow things down. Not to mention the signal problem, and just generally the fact that one connection taking too much time to handle will slow the handling of other connections.

A possible go-between might be to select/poll on multiple FDs and then hand the work to threads from a thread pool, but such a job would be justifiable only for a large number of connections, IMHO.

If you insist on using a single thread then select seems to be the underdog today - poll is just
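The "peculiar way that the connect call works for non-blocking sockets" that guy refers to is, on POSIX systems, this: connect returns immediately with EINPROGRESS, the socket later becomes writable when the attempt finishes, and the real outcome has to be fetched with the SO_ERROR socket option. A minimal sketch of that sequence follows, in Python for brevity (`connect_ex` returns the errno value instead of raising). The helper name `nonblocking_connect` and the timeout are assumptions for illustration; signal handling, which Amos raises, is not covered here.

```python
import errno
import os
import select
import socket


def nonblocking_connect(address, timeout=5.0):
    """Connect to address without ever blocking inside connect()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)              # must happen before connect
    rc = sock.connect_ex(address)        # errno value, 0 on immediate success
    if rc not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        raise OSError(rc, os.strerror(rc))

    # Completion (success or failure) is reported as writability.
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        sock.close()
        raise TimeoutError("connect did not complete in time")

    # Writable does not mean connected: ask the kernel how it actually ended.
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        raise OSError(err, os.strerror(err))
    return sock


if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # let the OS pick a free port
    listener.listen(1)
    client = nonblocking_connect(listener.getsockname())
    print(client.getpeername() == listener.getsockname())
    client.close()
    listener.close()
```

In a real event loop the socket would be added to the existing write set rather than waited on with its own select call; the separate call here just keeps the sketch self-contained.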
Re: need some help with tcp/ip programming
guy keren wrote:

> Amos Shapira wrote:
>
>> On 14/05/07, guy keren [EMAIL PROTECTED] wrote:
>>
>>> Rafi Cohen wrote:
>>>
>>>> Reading some documentation on tcp/ip programming, I had the impression that the select mechanism should detect such a remote disconnect event, thus enabling me to make a further read from this socket, which should end in reading 0 bytes. Reading 0 bytes should indicate disconnection and let me disconnect properly from my side and try to reconnect. However, it seems that select does not detect all those disconnect events and, even worse, I cannot see any rule behind when it does detect this and when it does not.
>>>
>>> select does not notice disconnections. it only notices if the socket was closed by the remote side. that's a completely different issue, and that's also the only time when you get a 0 return value from the read() system call.
>>
>> I think you are tinkering with semantics and so miss the real issue (do you work as a consultant? :).
>
> did you write that to rafi or to me? i'm not dealing with semantics - i am dealing with a real problem that stable applications have to deal with - when the network breaks, and you never get the close from the other side.
>
>> Basically - Rafi expects (as he should) that a read(fd,...)==0 after a select(2) call that indicated activity on fd means that the other side has closed the connection.
>
> if this is what he expects then, indeed, this is what happens.
>
>> Alas - I think that I read not long ago that there is a bug in Linux's select in implementing just that, and it might sometimes miss the close from the other side
>
> what you are describing here sounds astonishing - that such a basic feature of the sockets implementation is broken? i find this hard to believe without clear evidence.

Jumping in in the middle here: I don't have any clear evidence (or any evidence at all) for what Amos was talking about, but I did run across this worrying change in Wine:

http://www.winehq.org/pipermail/wine-cvs/2006-November/027552.html

Now, it is totally unclear to me whether the fd leak in question is the result of a Wine bug around select, or of select itself. This may, after all, prove to be nothing important. Then again, given that all Wine used to do was translate the Windows version of select to the almost identical Linux version, I find this worrying.

Shachar
Re: need some help with tcp/ip programming
Amos Shapira wrote:

>> in neither case will read return 0. the only time that read is allowed to return 0 is when it encounters an EOF. for a socket, this happens ONLY if the other side closed the sending side of the connection.
>
> Is there an on-line reference (or a manual page) to support this?

man 2 read

> From what I remember about select, the definition of it returning a "ready to read" bit set is "the next read won't block", which will be true for non-blocking sockets at any time, and therefore they weren't encouraged together with select.

I believe you two are talking about two different things here. There is a world of difference between UDP and TCP in that regard.

UDP is connectionless. This means that read's error codes relate only to the latest packet received. UDP also doesn't have a 100% clear concept of what CRC/checksum actually means. I still think it's a bug for select to report activity on a socket that merely received a packet with a bad checksum (there is no CRC in TCP/IP), as how do you even know it was intended for this socket?

In TCP, on the other hand, a read is related to the connection. Packets in TCP are coincidental. Under TCP, read returning 0 means just one thing - the connection is closed.

> if you have so many connections that multiple threads will become a problem, then a single thread having to cycle through all these connections one by one will also slow things down.

No, my experience begs to differ here. When I tested netchat (http://sourceforge.net/projects/nch), I found that a single thread had no problem saturating the machine's capacity for network in/out communication. As long as your per-socket handling does not require so much processing that it slows you down, merely cycling through the sockets will not be the problem if you are careful enough.

With netchat, I used libevent for that (see further on for details), so I was using epoll. Your mileage may vary with the other technologies.

> Not to mention the signal problem, and just generally the fact that one connection taking too much time to handle will slow the handling of other connections.

Yes, it is probably better to use a single thread that does the event waiting, and a thread pool for the actual processing. Having one thread per socket, however, is not a wise idea IMHO.

> A possible go-between might be to select/poll on multiple FDs and then hand the work to threads from a thread pool, but such a job would be justifiable only for a large number of connections, IMHO.

It's not that difficult to pull off, and I believe your analysis failed to account for the overhead of creating new threads for each new connection, as well as destroying the threads for connections that are no longer needed.

> If you insist on using a single thread then select seems to be the underdog today - poll is just as portable (AFAICT), and Boost ASIO (and I'd expect ACE) allows making portable code which uses the superior APIs such as epoll, kqueue and /dev/poll.

Personally, I use libevent (http://www.monkey.org/~provos/libevent/), which has the advantage of having C linkage (ASIO is C++, as is ACE). It also has a standard license (three-clause BSD). I skimmed over the Boost license, and it doesn't seem problematic, but I don't like people creating new licenses with no clear justification.

Shachar
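The arrangement Shachar prefers, one thread that waits for events plus a pool of threads that does the processing, can be sketched in a few lines. This is an illustration under stated assumptions, not how netchat or libevent is implemented: it uses Python's `selectors` and `concurrent.futures` for brevity, `handle` is a hypothetical stand-in for real per-request work, and `serve_once` runs a single pass of what would normally be an endless loop.

```python
import selectors
import socket
from concurrent.futures import ThreadPoolExecutor


def handle(data):
    """Hypothetical per-request processing; here it just upper-cases the payload."""
    return data.upper()


def serve_once(conns, pool, timeout=1.0):
    """One pass of the event loop.

    The calling thread waits for readiness and reads the bytes itself, so no
    two threads ever touch the same socket; only the processing of each
    payload is handed to the pool. Returns {socket: result}.
    """
    futures = {}
    with selectors.DefaultSelector() as sel:
        for conn in conns:
            sel.register(conn, selectors.EVENT_READ)
        for key, _events in sel.select(timeout):
            data = key.fileobj.recv(4096)
            if data:  # an empty read would mean the peer closed
                futures[key.fileobj] = pool.submit(handle, data)
    return {sock: fut.result() for sock, fut in futures.items()}


if __name__ == "__main__":
    a, b = socket.socketpair()
    c, d = socket.socketpair()
    b.sendall(b"ping")  # only one of the two connections has work pending
    with ThreadPoolExecutor(max_workers=2) as pool:
        results = serve_once([a, c], pool, timeout=0.5)
    print(results[a])
    for s in (a, b, c, d):
        s.close()
```

Because the pool's threads are created once and reused, this avoids the per-connection thread creation and teardown cost that Shachar points out, while a slow request only occupies one worker instead of stalling the waiting thread.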