Hello Peter, first of all, thanks for your opinions.
> One of the problems you will run into here is that it is hard to convert raw
> data into specific packets without many pre-configured assumptions or just
> plain guesswork.
> For example, if you can detect two distinct blobs from two fingers, this may
> be two fingers from the same hand or two fingers from two different hands. In
> both cases, the decision on how these packets are divided or linked together
> require. It gets even worse if you account for hovering items in the detection
> (e.g. you can see the finger touchpoint _and_ parts of the hand hovering)

In fact, I think that this is even a point in favor of the layered approach.
You are of course correct (and Jim Gettys too, he also mentioned this).
However, I believe that the lowest layer is in the best position to know what
its hardware is capable of and to use these assumptions to generate an
abstraction of the input data.

> The transformation into screen coordinates is of little issue. Ideally, you'd
> want applications using multi-touch stuff to be aware of the events anyway,
> in which case you'd just use the device coordinate space.

What if, for example, you have a camera-based input device with a fisheye
lens? I can't imagine that every frontend should do the radial undistortion
itself.

> Much more important is the transformation of the blobs into a polar coordinate
> system (e.g. is it a diagonal blob or a thumb, rotated by 30 degrees?). Where
> are you planning on doing this?

The position protocol delivers the major and minor axis vectors of an
equivalent ellipse; is this what you are thinking about?

> You cannot easily separate the interpretation layer from the previous two
> layers as only the interpretation layer can know whether something is a
> "pinch" gesture or just two users happen to move into the same direction with
> their fingers. You need something as close to raw data in this layer as
> possible.
Well, the position protocol does allow for this distinction, as long as the
hardware is actually capable of sensing the difference. Every position object
(*) has a "parent id" field, so in your example, a pinch gesture would only be
triggered on fingers with the same parent id. If the hardware can't
distinguish the two cases, then the parent id should, e.g., always be
0xDEADBEEF (or whatever), and everything works as expected, too. However,
maybe you don't actually want to prevent two people from scaling something
together - just a thought.

> Dividing detection and interpretation into distinct layers looks good on paper
> but it becomes hard to do anything useful. OTOH, merging them into one layer
> looks good on paper but but it becomes hard to do anything useful. :) *sigh*

Nicely put :-)

Yours, Florian

(*) Note that I don't say "position packet". I'd like to counter the
assumption that the "plaintext-over-UDP" method is the only way of delivering
events. I'm happy with XML, too :-)

-- 
0666 - Filemode of the Beast
_______________________________________________
xorg mailing list
xorg@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/xorg
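
P.S. To make the parent-id point concrete, here is a rough sketch (in Python,
with made-up event field names - nothing here is part of the actual protocol)
of how a gesture recognizer could use the field to only pair fingers that
share a parent:

```python
# Hypothetical sketch: pinch candidates are restricted to fingers with the
# same parent id. Event shape ({"id": ..., "parent_id": ...}) is illustrative.
from collections import defaultdict

UNKNOWN_PARENT = 0xDEADBEEF  # hardware that can't tell hands apart

def group_by_parent(positions):
    """Group position objects by their parent id (e.g. the hand they belong to)."""
    groups = defaultdict(list)
    for pos in positions:
        groups[pos["parent_id"]].append(pos)
    return groups

def pinch_candidates(positions):
    """Return finger pairs that may form a pinch: same parent id required."""
    pairs = []
    for fingers in group_by_parent(positions).values():
        for i in range(len(fingers)):
            for j in range(i + 1, len(fingers)):
                pairs.append((fingers[i], fingers[j]))
    return pairs

# Two fingers from different hands never form a pinch candidate:
assert pinch_candidates([{"id": 1, "parent_id": 10},
                         {"id": 2, "parent_id": 11}]) == []

# Two fingers from the same hand yield exactly one candidate pair:
assert len(pinch_candidates([{"id": 1, "parent_id": 10},
                             {"id": 2, "parent_id": 10}])) == 1
```

If the hardware can't distinguish hands and reports UNKNOWN_PARENT for
everything, all fingers land in one group and the recognizer degrades
gracefully, as described above.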