[chromium-dev] Re: Fwd: [chromium-dev] Re: OS X IPC Design doc
On Mon, Nov 17, 2008 at 12:25 AM, John Abd-El-Malek [EMAIL PROTECTED] wrote:
> I'm not trying to argue for one system or another, but I think things like sending 1MB or 7MB of data quickly, or up to 256MB, aren't actually needed by the existing code.

That's my suspicion as well, but the code is there for large payloads; I'll sync up my Windows build today and start collecting some stats on both message sizes and latency that we can use as a concrete reference point.

--Amanda

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups Chromium-dev group.
To post to this group, send email to chromium-dev@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/chromium-dev?hl=en
-~--~~~~--~~--~--~---
[chromium-dev] Re: Fwd: [chromium-dev] Re: OS X IPC Design doc
On Mon, Nov 17, 2008 at 8:59 AM, Amanda Walker [EMAIL PROTECTED] wrote:
> I'll sync up my windows build today and start collecting some stats on both message sizes and latency that we can use as a concrete reference point.

I should say "we" here--Jeremy may already be a step ahead of me :-).

--Amanda
[chromium-dev] Re: Fwd: [chromium-dev] Re: OS X IPC Design doc
Not to beat a dead horse, but looking at these tables, it seems that the decision to go one way or another can be made simply by looking at the 3 bottom rows. However, I'm pretty sure you won't find any existing messages that go over a few KB. For any large data transfer, we use shared memory buffers.

I'm not trying to argue for one system or another, but I think things like sending 1MB or 7MB of data quickly, or up to 256MB, aren't actually needed by the existing code.

On Wed, Nov 12, 2008 at 4:46 PM, Jeremy Moskovich [EMAIL PROTECTED] wrote:
> We ran some benchmarks of Mach ports vs FIFOs on OSX, you can find the results in the Performance Considerations section of the Design doc.
>
> Best regards,
> Jeremy
[chromium-dev] Re: Fwd: [chromium-dev] Re: OS X IPC Design doc
To my knowledge, we don't have specific speed requirements, so no, there's no simple test that could show whether or not a given mechanism is fast enough. However, as I noted yesterday, it should be fairly straightforward to extract some performance data from the Windows build, which would at least give us information on the distribution of sizes and latencies to use as a point of comparison.

--Amanda

On Thu, Nov 13, 2008 at 9:54 AM, Dan Kegel [EMAIL PROTECTED] wrote:
> But without knowing what is fast enough, it's hard to make intelligent tradeoffs. It could well be that pipes are fast enough. It shouldn't be too hard to prove or disprove that, should it?

--
--Amanda
"I'll say it again for the logic impaired." --Larry Wall
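A minimal sketch of how a size-versus-latency reference point for pipes might be collected, assuming a POSIX system with `fork()`. The helper names (`read_exact`, `write_all`, `pipe_round_trip_us`) are hypothetical and not part of any Chromium code; the sketch measures only raw pipe round trips, not the real IPC channel.

```python
import os
import statistics
import time


def read_exact(fd, n):
    """Read exactly n bytes from fd, raising EOFError if the peer closes."""
    chunks = []
    while n > 0:
        chunk = os.read(fd, n)
        if not chunk:
            raise EOFError("peer closed the pipe")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def write_all(fd, data):
    """Write all of data to fd, looping over partial writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def pipe_round_trip_us(payload_size, rounds=200):
    """Median round-trip time in microseconds for one payload size."""
    # Two anonymous pipes form a duplex channel between parent and child.
    to_child_r, to_child_w = os.pipe()
    to_parent_r, to_parent_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: echo each payload back, then exit without running cleanup.
        os.close(to_child_w)
        os.close(to_parent_r)
        try:
            for _ in range(rounds):
                write_all(to_parent_w, read_exact(to_child_r, payload_size))
        finally:
            os._exit(0)
    os.close(to_child_r)
    os.close(to_parent_w)
    payload = b"x" * payload_size
    samples = []
    for _ in range(rounds):
        start = time.perf_counter()
        write_all(to_child_w, payload)
        read_exact(to_parent_r, payload_size)
        samples.append((time.perf_counter() - start) * 1e6)
    os.close(to_child_w)
    os.close(to_parent_r)
    os.waitpid(pid, 0)
    return statistics.median(samples)


if __name__ == "__main__":
    for size in (64, 4096, 65536, 1 << 20):
        print(size, "bytes:", round(pipe_round_trip_us(size), 1), "us")
```

The median is used rather than the mean so that a few scheduler hiccups do not dominate the result; which sizes matter depends on the traffic distribution the thread proposes to measure.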
[chromium-dev] Re: Fwd: [chromium-dev] Re: OS X IPC Design doc
Right on... We can definitely support custom implementations for each platform, and we should do that if it significantly lowers barriers and helps us get to a better product.

I think it was reasonable to talk about sharing code here between Linux and OSX because this code, while small and contained, is something, much like the MessageLoop, that will invariably get a lot of attention due to very subtle issues (given our experience on Windows). It would be nice to share that kind of effort whenever it makes sense.

-Darin

On Thu, Nov 13, 2008 at 8:18 AM, Amanda Walker [EMAIL PROTECTED] wrote:
> Sure. But let's also put this into perspective. This is a small bit of code that is already wrapped by an API to hide the implementation details (which themselves already differ between platforms). It should be difficult neither to implement nor to re-implement as conditions or requirements change (or in this case, simply become clearer). Both the costs and the benefits of either sharing or not sharing the implementation between Mac OS X and Linux are low.
>
> So we're approaching a bikeshed discussion, where the people with a lot of Mac experience see one clear answer, and the people with a lot of Linux experience see another. A couple people raised questions about the performance differential that some of us asserted in favor of Mach IPC, so we went and generated some objective data (which was quite worthwhile).
>
> The other factors you mention also point to using Mach IPC on the Mac--the ease of exchanging shared memory objects is in fact where its speed advantage comes from for larger messages. For waitable events, there's no direct equivalent to a Windows waitable object, so some other mechanism (semaphores, condition variables, etc.) will be necessary on both the Mac and Linux, regardless.
>
> I'd like to not get into a long, drawn-out design by parade discussion on such a small module. If there are known requirements that haven't made it into the Windows code or design docs that someone could summarize, that would be great and will help all of us. In the absence of that, I think that going on general principles and prior experience, and doing an implementation bake-off, is a reasonable course of action.
>
> --Amanda
>
> On Thu, Nov 13, 2008 at 1:12 AM, Darin Fisher [EMAIL PROTECTED] wrote:
>> Keep in mind that pipes are not really the fastest IPC mechanism for Windows. Mike had a much faster shared memory based solution. However, we found that the pipe based solution was easiest to integrate with the sandbox, and it was also fast enough such that other factors outweigh the performance differential between the two mechanisms.
>>
>> I guess what I'm saying is that we should probably not get too caught up in the performance differences here unless we think that is the dominant factor. Other things might be more important such as how easy it is to exchange shared memory and waitable events (the equivalent of a Windows event object).
>>
>> -Darin
>>
>> On Wed, Nov 12, 2008 at 7:07 PM, Jeremy Moskovich [EMAIL PROTECTED] wrote:
>>> Hi Dan,
>>>
>>> Looking at the current IPC behavior is definitely on my list of things to do. I don't think that should change the interpretation of the data though. According to our measurements, Mach messages are always faster *. So the question becomes not "is it faster?" but "by how many orders of magnitude?"
>>>
>>> Best regards,
>>> Jeremy
>>>
>>> * Using inline messages there's a break even at some point as the cost of copying data takes over, but as discussed we can use OOL messages to get really fast (~30uSec) constant time sends for messages over 5K.
>>>
>>> On Wed, Nov 12, 2008 at 5:08 PM, Dan Kegel [EMAIL PROTECTED] wrote:
>>>> On Wed, Nov 12, 2008 at 4:46 PM, Jeremy Moskovich [EMAIL PROTECTED] wrote:
>>>>> We ran some benchmarks of Mach ports vs FIFOs on OSX, you can find the results in the Performance Considerations section of the Design doc.
>>>>
>>>> I don't see any measurements showing what typical Chrome IPC traffic looks like. Without that, it's hard to interpret your results. If I missed it, please point me to it.
>
> --
> --Amanda
> I have never seen anything fill up a vacuum so fast and still suck. --Rob Pike on the X Win...
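One way the "some other mechanism" for waitable events could look on both the Mac and Linux is a pipe-backed event, as suggested elsewhere in this thread. This is a minimal sketch under that assumption: `PipeEvent` is a hypothetical name, it models only a manual-reset event, and sharing it across processes would additionally require the descriptors to be inherited or passed to the peer.

```python
import os
import select


class PipeEvent:
    """Manual-reset event backed by an anonymous pipe (illustrative only)."""

    def __init__(self):
        self._read_fd, self._write_fd = os.pipe()
        # Non-blocking ends so signal() and reset() never stall the caller.
        os.set_blocking(self._read_fd, False)
        os.set_blocking(self._write_fd, False)

    def signal(self):
        # Any byte in the pipe means "signaled"; a full pipe already is.
        try:
            os.write(self._write_fd, b"\0")
        except BlockingIOError:
            pass

    def reset(self):
        # Drain every pending byte so the next wait() blocks again.
        try:
            while os.read(self._read_fd, 4096):
                pass
        except BlockingIOError:
            pass

    def wait(self, timeout=None):
        # The event is signaled exactly when the read end is readable.
        readable, _, _ = select.select([self._read_fd], [], [], timeout)
        return bool(readable)

    def close(self):
        os.close(self._read_fd)
        os.close(self._write_fd)


if __name__ == "__main__":
    event = PipeEvent()
    event.signal()
    print("signaled:", event.wait(0))
    event.close()
```

Because the event is just a file descriptor, several of them can be waited on together with `select()`, which is roughly the role `WaitForMultipleObjects` plays on Windows.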
[chromium-dev] Re: Fwd: [chromium-dev] Re: OS X IPC Design doc
On Thu, Nov 13, 2008 at 12:46 PM, Darin Fisher [EMAIL PROTECTED] wrote:
> Right on... We can definitely support custom implementations for each platform, and we should do that if it significantly lowers barriers and helps us get to a better product.
>
> I think it was reasonable to talk about sharing code here between Linux and OSX because this code, while small and contained, is something, much like the MessageLoop, that will invariably get a lot of attention due to very subtle issues (given our experience on Windows). It would be nice to share that kind of effort whenever it makes sense.

Oh, very much agreed. And it never hurts to confirm assumptions. Jeremy's benchmark did turn up some very interesting desktop/laptop differences, which point up areas we may want to keep an eye on: effects of multiple CPU cores, memory bandwidth, etc.

--Amanda
[chromium-dev] Re: Fwd: [chromium-dev] Re: OS X IPC Design doc
To add to the discussion (or to the confusion): we have a similar situation on Windows. Pipes are not the fastest IPC on Windows. LPC ports, and in Vista ALPC ports, are the fastest, then probably any homebrew shared memory scheme, and then pipes. The issues are that LPC has no officially sanctioned API, and that shared memory channels, which are the IPC used in the sandbox, have trouble managing arbitrary-sized messages, which we don't support. At the end of the day we could add enough code to the sandbox IPC so it resembles the semantics of the pipe, but at that point the speed is no longer what it was. Now you see, all IPC is just layers of overhead on top of shared memory reads and writes, but sometimes you want that overhead.

No, I am not arguing in favor of the pipes per se. I just want us not to fall into the trap of poorly reinventing the desired IPC semantics. We always said that each platform port should take advantage of the native capabilities. Besides, we have layers of objects on top of the IPC; I'd be much more worried if we had to change their semantics or APIs.

More: at least on Windows, the two sides of the pipe are not equal. One is clearly the server and the other is clearly the client, and there is an asymmetric API and semantics; this complicates peer-to-peer designs but it is very nice for client-server schemes like browser-renderer. Connecting as a client to a pipe implies a fair amount of trust in the server, even before the first byte is exchanged. So we need to be careful on other platforms not to weaken the master-slave nature of the Chrome IPC architecture.

If you benchmark this, you need to be careful to do a fairly complete impl of both. We want to make sure we account for the overhead that comes with variable-size packets (up to 256M on Windows), the duplex nature, and what happens when one of the sides just dies or just decides to close the connection.

-cpu

On Nov 7, 8:06 pm, Linus Upson [EMAIL PROTECTED] wrote:
> If you want to argue using data, we'll use that. If you want to argue using opinions, we'll use mine.
>
> Linus
>
> On Fri, Nov 7, 2008 at 7:55 PM, Amanda Walker [EMAIL PROTECTED] wrote:
>> They work fine, though using Apache as an example, a Linux box running Apache can generally handle a higher load than Mac OS X running on the same hardware (historically, this has been true for any BSD-based kernel, not just Mac OS X). That said, on modern hardware Apache is mostly limited by TCP throughput over Ethernet, not interprocess communication.
>>
>> 10.5 is indeed better in this regard than prior versions of Mac OS X, though the experience of other Mac client teams at Google shows a similar performance differential, as do other comparisons. For a recent example, see http://www.cs.virginia.edu/~jom5x/papers/macos.pdf. While they didn't make a perfect apples to apples comparison, they did use comparable hardware and measured an almost 2:1 performance difference between Mac OS X and Linux when it came to latency using signals and pipes between processes.
>>
>> However, we can all go back and forth all day with "based on prior experience, X is faster than Y" vs. "but it shouldn't be" :-). We'll put together some performance tests that will let us do some testing of Mac OS X pipes and signals, Linux pipes and signals, and Mach IPC on the same piece of hardware. Having some objective measurements should help this discussion immensely.
>>
>> --Amanda
>>
>> On Fri, Nov 7, 2008 at 8:54 PM, Wan-Teh Chang [EMAIL PROTECTED] wrote:
>>> On Thu, Nov 6, 2008 at 2:05 PM, Amanda Walker [EMAIL PROTECTED] wrote:
>>>> Linux and Darwin are only superficially similar, and the differences get larger the closer to the kernel we get. I realize I'm being repetitious here :-), but generally speaking, starting with the assumption that one technique will work on both, especially if it involves IPC, threading, or process creation, is a mistake.
>>>
>>> I'm very surprised that this is the case for Mac OS X 10.5. It is to Mac's advantage to make it easy to port Unix code to Mac OS X. If Apache can run well on Mac OS X, these common system calls should have a good implementation on Mac OS X.
>>>
>>> Wan-Teh
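The overhead cpu mentions (variable-size packets, and noticing when the other side dies or closes) is the part a bare pipe or stream socket does not provide. A minimal sketch of what that layer might look like follows, assuming a 4-byte length prefix; the wire format and the names `send_message` and `recv_message` are illustrative assumptions, not Chromium's actual protocol.

```python
import socket
import struct

HEADER = struct.Struct("!I")      # assumed framing: 4-byte big-endian length
MAX_MESSAGE = 256 * 1024 * 1024   # the 256M upper bound mentioned for Windows


def send_message(sock, payload):
    """Send one length-prefixed message."""
    if len(payload) > MAX_MESSAGE:
        raise ValueError("message too large")
    sock.sendall(HEADER.pack(len(payload)) + payload)


def _recv_exact(sock, n):
    """Read exactly n bytes, or return None if the peer closed first."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            return None
        buf.extend(chunk)
    return bytes(buf)


def recv_message(sock):
    """Return the next payload, or None if the peer closed or died."""
    header = _recv_exact(sock, HEADER.size)
    if header is None:
        return None
    (length,) = HEADER.unpack(header)
    if length > MAX_MESSAGE:
        raise ValueError("peer announced an oversized message")
    return _recv_exact(sock, length)


if __name__ == "__main__":
    browser, renderer = socket.socketpair()
    send_message(browser, b"navigate")
    print(recv_message(renderer))
    browser.close()               # simulate one side going away
    print(recv_message(renderer))
    renderer.close()
```

A fair benchmark of either mechanism would time this framed path rather than raw reads and writes, since the framing and the end-of-stream handling are part of the cost being compared.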
[chromium-dev] Re: Fwd: [chromium-dev] Re: OS X IPC Design doc
They work fine, though using Apache as an example, a Linux box running Apache can generally handle a higher load than Mac OS X running on the same hardware (historically, this has been true for any BSD-based kernel, not just Mac OS X). That said, on modern hardware Apache is mostly limited by TCP throughput over Ethernet, not interprocess communication.

10.5 is indeed better in this regard than prior versions of Mac OS X, though the experience of other Mac client teams at Google shows a similar performance differential, as do other comparisons. For a recent example, see http://www.cs.virginia.edu/~jom5x/papers/macos.pdf. While they didn't make a perfect apples to apples comparison, they did use comparable hardware and measured an almost 2:1 performance difference between Mac OS X and Linux when it came to latency using signals and pipes between processes.

However, we can all go back and forth all day with "based on prior experience, X is faster than Y" vs. "but it shouldn't be" :-). We'll put together some performance tests that will let us do some testing of Mac OS X pipes and signals, Linux pipes and signals, and Mach IPC on the same piece of hardware. Having some objective measurements should help this discussion immensely.

--Amanda

On Fri, Nov 7, 2008 at 8:54 PM, Wan-Teh Chang [EMAIL PROTECTED] wrote:
> On Thu, Nov 6, 2008 at 2:05 PM, Amanda Walker [EMAIL PROTECTED] wrote:
>> Linux and Darwin are only superficially similar, and the differences get larger the closer to the kernel we get. I realize I'm being repetitious here :-), but generally speaking, starting with the assumption that one technique will work on both, especially if it involves IPC, threading, or process creation, is a mistake.
>
> I'm very surprised that this is the case for Mac OS X 10.5. It is to Mac's advantage to make it easy to port Unix code to Mac OS X. If Apache can run well on Mac OS X, these common system calls should have a good implementation on Mac OS X.
>
> Wan-Teh
[chromium-dev] Re: Fwd: [chromium-dev] Re: OS X IPC Design doc
If you want to argue using data, we'll use that. If you want to argue using opinions, we'll use mine.

Linus

On Fri, Nov 7, 2008 at 7:55 PM, Amanda Walker [EMAIL PROTECTED] wrote:
> They work fine, though using Apache as an example, a Linux box running Apache can generally handle a higher load than Mac OS X running on the same hardware (historically, this has been true for any BSD-based kernel, not just Mac OS X). That said, on modern hardware Apache is mostly limited by TCP throughput over Ethernet, not interprocess communication.
>
> 10.5 is indeed better in this regard than prior versions of Mac OS X, though the experience of other Mac client teams at Google shows a similar performance differential, as do other comparisons. For a recent example, see http://www.cs.virginia.edu/~jom5x/papers/macos.pdf. While they didn't make a perfect apples to apples comparison, they did use comparable hardware and measured an almost 2:1 performance difference between Mac OS X and Linux when it came to latency using signals and pipes between processes.
>
> However, we can all go back and forth all day with "based on prior experience, X is faster than Y" vs. "but it shouldn't be" :-). We'll put together some performance tests that will let us do some testing of Mac OS X pipes and signals, Linux pipes and signals, and Mach IPC on the same piece of hardware. Having some objective measurements should help this discussion immensely.
>
> --Amanda
>
> On Fri, Nov 7, 2008 at 8:54 PM, Wan-Teh Chang [EMAIL PROTECTED] wrote:
>> On Thu, Nov 6, 2008 at 2:05 PM, Amanda Walker [EMAIL PROTECTED] wrote:
>>> Linux and Darwin are only superficially similar, and the differences get larger the closer to the kernel we get. I realize I'm being repetitious here :-), but generally speaking, starting with the assumption that one technique will work on both, especially if it involves IPC, threading, or process creation, is a mistake.
>>
>> I'm very surprised that this is the case for Mac OS X 10.5. It is to Mac's advantage to make it easy to port Unix code to Mac OS X. If Apache can run well on Mac OS X, these common system calls should have a good implementation on Mac OS X.
>>
>> Wan-Teh
[chromium-dev] Re: Fwd: [chromium-dev] Re: OS X IPC Design doc
It sounds like things are still fairly speculative...

Since Linux and Darwin are so similar, it seems like it would be very nice to share code. Have you looked at using unix-domain sockets and sendmsg to achieve the equivalent of DuplicateHandle? Since we always use DuplicateHandle in one direction, to map a local handle into a representation that the renderer understands (due to sandbox restrictions), perhaps we could have a special IPC channel that we use to implement the equivalent of DuplicateHandle. The renderer could have a dedicated thread for this purpose to ensure that it happens with low latency. Or maybe there is some other canonical way of simulating DuplicateHandle?

-Darin

On Thu, Nov 6, 2008 at 5:56 AM, Amanda Walker [EMAIL PROTECTED] wrote:
> Re-forwarding to chromium-dev, since I sent it from the wrong address last night :-/
>
> -- Forwarded message --
> From: Amanda Walker [EMAIL PROTECTED]
> Date: Wed, Nov 5, 2008 at 10:07 PM
> Subject: Re: [chromium-dev] Re: OS X IPC Design doc
> To: chromium-dev@googlegroups.com
>
> Since I'm the one who suggested Mach IPC to Jeremy, I should chime in here.
>
> When we looked at the Windows IPC code (the set of capabilities that the browser uses to communicate with the renderers, and the renderers use to communicate with plugins, not just the IPC classes per se), we identified several capabilities that Chrome currently relies on:
>
> - Bidirectional message passing. While this is built on top of a named pipe in Windows, it also uses the ability to duplicate a handle (a blob of memory) into another process and pass a reference to it as part of a message, without having to explicitly map and unmap shared memory to do so.
> - Shared memory. For shared bitmaps, this is used fairly conventionally, but some other uses (such as visited link processing and greasemonkey support) apply the ability to pre-emptively map a segment into another process's address space.
> - Arbitrary-sized messages
> - High performance
> - Leaves no footprints in the file system
>
> The most obvious way to provide most of those capabilities in a UNIX process is via System V shared memory and IPC, with mmap() and UNIX domain sockets a close second. Both of these mechanisms will work fine on the Mac (with a slight preference towards the latter for performance, even though they will leave footprints in the file system if we're not careful).
>
> However, the closest fit semantically to the whole set of capabilities that Chrome currently uses is the Mach IPC and VM APIs, which are generally lighter weight and more flexible, but less portable. I expect that the Linux implementation will run on a Mac, but I would also like to investigate implementations directly on top of Mach APIs--especially for tasks (like resource loading) that involve large, variable-sized payloads. The ability of Mach IPC to efficiently map such payloads among multiple processes without having to write them out to the file system or keep track of ownership of anonymous mmap segments is something I think is very attractive, since it gives us the performance win without the additional complexity compared to the Windows implementation. We're looking at the same sort of criteria that caused Apple to implement DO (Distributed Objects) and other Mac OS X system services via Mach rather than UNIX APIs that are shared with Linux.
>
> Now, all that said, I do think it's important to do some concrete comparisons, and it's always nice to share code if there's not a large performance (or other) cost. But given the small amount of code involved, I'd like to focus on exploring the most straightforward implementation on each platform, without using "share this implementation" as an up-front constraint. In the extreme case, of course, the Mac should be able to build and run the Linux version, X11 and all :-). That doesn't mean that's the best approach. I think that the earlier suggestion of writing some timing tests for the IPC modules is a great idea, since it'll let us make some of those concrete comparisons.
>
> I think of it as analogous to other areas where we are using native Mac APIs in the Mac implementation of some modules even though there are roughly equivalent UNIX APIs that could also be used with more effort: high resolution timers and time/date conversion are one recent example. Some of Avi's recent work with SSL integration is another.
>
> Does that help give some background?
>
> --Amanda
>
> On Wed, Nov 5, 2008 at 8:26 PM, Darin Fisher [EMAIL PROTECTED] wrote:
>> Sorry to be so persistent, but I don't understand why you need those things. Can you provide some specific examples?
>>
>> As far as I know, we need the ability to have shared memory. It seems like we can do that with mmap. We need a way to have shared waitable events (like windows event objects), and that can be done using a connected anonymous pipe. In both of those cases, we have something that is a file
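Darin's question about using unix-domain sockets and sendmsg as a stand-in for DuplicateHandle can be made concrete with `SCM_RIGHTS` descriptor passing. The following is a minimal sketch under that assumption; `send_fd` and `recv_fd` are hypothetical helper names, and the one-byte `b"F"` payload is an arbitrary placeholder, since the ancillary data must ride on at least one byte of ordinary data.

```python
import array
import os
import socket


def send_fd(sock, fd):
    """Ask the kernel to install a duplicate of fd in the receiving process."""
    fds = array.array("i", [fd])
    sock.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(sock):
    """Receive one descriptor sent with send_fd and return its local number."""
    fds = array.array("i")
    _, ancdata, _, _ = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any trailing bytes that do not form a whole descriptor.
            usable = len(data) - (len(data) % fds.itemsize)
            fds.frombytes(data[:usable])
            return fds[0]
    raise RuntimeError("no file descriptor received")


if __name__ == "__main__":
    # socketpair() creates connected sockets with no name in the file system.
    browser, renderer = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    send_fd(browser, read_end)
    received = recv_fd(renderer)
    os.write(write_end, b"ping")
    print(os.read(received, 4))
```

Unlike DuplicateHandle, the sender never learns the receiver's descriptor number; the receiver gets it from `recvmsg`, so a message that refers to the handle has to travel on the same channel as the descriptor itself. The same mechanism would carry a shared memory descriptor or the ends of a pipe-based event.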
[chromium-dev] Re: Fwd: [chromium-dev] Re: OS X IPC Design doc
On Thu, Nov 6, 2008 at 2:05 PM, Amanda Walker [EMAIL PROTECTED] wrote:
> On Thu, Nov 6, 2008 at 4:45 PM, Darin Fisher [EMAIL PROTECTED] wrote:
>> It sounds like things are still fairly speculative...
>
> Well, performance differences are not speculative, though we don't know what the effect on Chromium would be.
>
>> Since Linux and Darwin are so similar, it seems like it would be very nice to share code.
>
> Linux and Darwin are only superficially similar, and the differences get larger the closer to the kernel we get. I realize I'm being repetitious here :-), but generally speaking, starting with the assumption that one technique will work on both, especially if it involves IPC, threading, or process creation, is a mistake. While I have some bias from personal experience, this issue comes up again and again in places like the darwin-dev mailing list, where "X works fine on my Linux box, why doesn't it work well on the Mac?" might as well be a FAQ.
>
> I agree that it would be very nice to share code. We have to write a pipe/socket based implementation for Linux anyway, so I'm not arguing against that at all. I'm suggesting that we also do a bake-off of that against native IPC on the Mac, and make a decision based on objective data. We're talking about a small amount of code, so the benefit of doing so should greatly outweigh the cost.
>
> --Amanda

I don't object to a bake-off. It just seemed like we were skipping ahead to just do an OSX-specific solution first, leaving the bake-off for later (or never). Did I misunderstand?

-Darin