Re: Iptable rules and MARK values for sessions

Daniel Wagner Thu, 07 Nov 2013 05:41:06 -0800

Hi Glenn,

[Sorry for the late response. I am pretty busy at the moment.]


On 11/05/2013 08:27 PM, Glenn Schmottlach wrote:

As I understand it, Connman sessions are tracked, in part, by marking
packets associated with the session's UID, GID, or SELinux context
information. This is translated into iptables rules which "mark" the
connections as described in session-overview.txt.

Per session iptables rules:

iptables -t mangle -A OUTPUT -m owner [--uid-owner|--gid-owner] $OWNER \
   -j MARK --set-mark $MARK

iptables -t filter -A INPUT -m mark --mark $MARK \
-m nfacct --nfacct-name session-input-$MARK
iptables -t filter -A OUTPUT -m mark --mark $MARK \
-m nfacct --nfacct-name session-output-$MARK
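
For illustration, the rule templates above can be expanded for one
concrete session with a small shell helper. This is only a sketch: the
function name session_rules and the example UID are hypothetical, and the
commands are printed rather than executed, so nothing on the running
system is changed.

```shell
# Print (not execute) the per-session rules from the templates above.
# session_rules is a hypothetical helper:
#   $1 = owner option (--uid-owner or --gid-owner)
#   $2 = owner id
#   $3 = session mark
session_rules() {
    printf 'iptables -t mangle -A OUTPUT -m owner %s %s -j MARK --set-mark %s\n' "$1" "$2" "$3"
    printf 'iptables -t filter -A INPUT -m mark --mark %s -m nfacct --nfacct-name session-input-%s\n' "$3" "$3"
    printf 'iptables -t filter -A OUTPUT -m mark --mark %s -m nfacct --nfacct-name session-output-%s\n' "$3" "$3"
}

# Example: a session owned by UID 1000 with mark 256
session_rules --uid-owner 1000 256
```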



BTW, nfacct is going away. We are going to use NFQUEUE in the future.
Though we still need the MARK unless we can convince the netdev
guys that the lookup key for a policy routing table could be something
else, e.g. the cgroup id. For the time being I keep the assumption that
we need the marker.


I'm curious how you anticipate using NFQUEUE in the future unless
you're considering re-directing the network traffic to a user-space
application (like Connman). Are you trying to do a transparent proxy
or intercept the packets in some way? If so, have you considered
TPROXY or even using NFLOG to get at these packets? My only
concern/confusion here is the intent for using NFQUEUE since it seems
you want to funnel matching packets into user-space. Can you elaborate
on your plans? Wouldn't this have a negative impact on the performance
of the connections managed by Connman if you are moving in that
direction?


The Session API is also designed to allow per application statistics
and routing. The current implementation is trying to use NFACCT
for the statistics. That doesn't really scale though. We need one
rule per direction and protocol type per application. Furthermore,
we need to poll for the updates.

Android supports the same feature via xt_qtaguid. There was a good
presentation during the LPC on this topic:

http://www.youtube.com/watch?v=Fi_iyaF7Gw0

Eric Dumazet and Thomas Graf attended this session and both
said NFQUEUE is what we should use for the kind of feature we
want to support. They also pointed out that the operation is
dirt cheap. Let's see, at least we have them on tape and can
bother them to fix it if it isn't :D (just joking).

A colleague of mine and I are already working on this. We need
to extend the connection tracking information with skuid
and skgid. We are creating a 'statistic daemon' to collect the data.
The main reason not to add it to ConnMan directly is that
libnetfilter_queue (and friends) have a blocking API, which
is completely unfit for ConnMan. Besides this, I don't think
it is a great idea to have ConnMan look at all packets.
I hope to have some real numbers on performance
and power impact soon.

These rules instruct iptables to collect basic usage metrics
(bytes/packets sent/received) on behalf of Connman.

I have a requirement to do additional filtering and statistics
collection using my own set of rules and would like to filter on these
same MARK'ed packets while adding additional "marks" (or fields) to
the packets so that I can separate them into additional categories
needed by my application. This is where I need some additional
information from Connman and a slight modification to how packets are
MARK'ed and compared.

As I understand it, each netfilter packet has an associated 32-bit
mark value which is being set to $MARK in the above rules. When
setting a MARK value you can provide a mask which will only allow
specific bits to be set. Looking at the session.c code (line 39) I see
that new sessions are initialized starting at 256 and are incremented
upwards from there. So my assumption is that Connman assumes it "owns"
the upper 24 bits of the netfilter mark value. Likewise, it would seem
from the iptables rules that it does not expect other iptables rules
to use these same bits for marking packets.



ConnMan starts counting at 256 because that is the first 'unmanaged'
ID for a policy routing table. To simplify, I used the same value
for the routing table as for the marker. Obviously the current
implementation doesn't really play nicely, since it assumes it has
complete control. I have no problem changing that.
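
To make the link between marker and routing table concrete: iproute2
reserves table IDs 253 (default), 254 (main) and 255 (local), so 256 is
the first ID above them. A minimal sketch of pairing a mark with a table
of the same number follows. The helper name session_routing, the gateway
and the interface are hypothetical, and how ConnMan actually installs its
rules is not shown in this thread. The commands are only printed.

```shell
# session_routing is a hypothetical helper:
#   $1 = session mark, reused as the routing table id
#   $2 = gateway address
#   $3 = outgoing interface
session_routing() {
    # Packets carrying the session mark are looked up in the session's table
    printf 'ip rule add fwmark %s table %s\n' "$1" "$1"
    # The session's table gets its own default route
    printf 'ip route add default via %s dev %s table %s\n' "$2" "$3" "$1"
}

# Example with a documentation-range gateway and an assumed interface name
session_routing 256 192.0.2.1 wlan0
```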

What I am proposing is the following:

1) Modify Connman to provide a mechanism to retrieve the iptable
"MARK" value associated with a session as well as the bitmask used to
filter/extract those bits from the underlying U32 netfilter value.
Perhaps both the session MARK value and bitmask could be considered
read-only "settings" of the session object and would be updated when
the iptable rule is written.



I don't really follow here. Could you please elaborate a bit?


Let me try to explain what I'm trying to accomplish. On my target
platform there exist applications that run as separate processes. My
intention is to run each "app" as a separate Linux user.


This is similar to what we want to do (and what Android does).

These apps
might still be further divided into groups with similar features (e.g.
streaming apps, music apps, etc...). My thought was to assign each app
to a Linux group, or as you indicate, perhaps a specific cgroup if I
utilize that Linux feature. I need to enforce both a per-application
network data quota (bytes sent + received < data limit) as well as an
overall application "group" data limit.


I think we share the same use case here. Apart from the grouping of
several applications, I want to support the same use case.

My thought is that I could create a Connman policy (or session
configuration) for *each* application. Connman seems to already
provide the mechanism to provide a prioritized Bearer list and the
necessary indications to tell an application whether the underlying
Bearer/service is connected (or not) and to request a connection
through the Session API.


Yes, the Session API is not only there for notification.

It seems that when a Session is created new
rules are added to the iptable to "mark" these packets from a
particular application based on the UID, GID, or SELinux context.


Yes.

I want to set up additional filters on these same marked packets
except I also want to add a field to mark a GID association in the
same U32 netfilter value. I need to know the netfilter mask I can use
to mask off the fields Connman uses for each packet. My thought was to
do this outside of Connman since it's not clear this is something
everyone needs/desires in their architectures.


Maybe. Could you list your rules you want to use?

Also, I want to use
netfilter to collect metrics for a combined send/receive byte count
since currently Connman tracks sent/received bytes separately. So in
my solution, I add additional rules for each app (using the Connman
"mark" value) to collect a send/receive application statistic. I also
use my own netfilter "mark" value to track the application's Linux
"group" so I can do group statistics and impose a quota at that level
as well (which could override the quota at the application level).


This sounds pretty reasonable. I don't see why something like this
shouldn't be supported upstream by ConnMan. Note that we completely lack
code for supporting a per-application quota so far.

I
can do all of this directly with iptables (I've prototyped it by hand)
but need Connman to provide me with the per-session/application "mark"
value and the filter I can use to mask off that value for my own
iptable rules. Retrieving this information as a read-only Session
property seemed the most convenient way. I'm open to other suggestions
however.


I would suggest we join forces here and add the needed code
directly to ConnMan or the 'statistic daemon'.

2) Provide a configuration option (either at compile-time or via
configuration file item), that specifies the offset/mask for the
session's MARK value. For instance, currently the inferred offset (by
examining the code) is 8 bits with 0xFFFFFFFF as the default mask.
This doesn't allow the Connman markings to co-exist with any other
external markings. For my application I would like to reserve the
upper 16 bits for the Connman session value and the lower 16 bits for
my own use. The Connman code that generates the iptables rule might be
modified to look like this:

iptables -t mangle -A OUTPUT -m owner [--uid-owner|--gid-owner] $OWNER \
   -j MARK --set-mark $MARK/0xFFFF0000

iptables -t filter -A INPUT -m mark --mark $MARK/0xFFFF0000 \
-m nfacct --nfacct-name session-input-$MARK
iptables -t filter -A OUTPUT -m mark --mark $MARK/0xFFFF0000 \
-m nfacct --nfacct-name session-output-$MARK



That sounds reasonable to me. I don't think we need 2^24 ids, we
should be happy with 2^16 applications :)


I thought so too . . . I can't imagine (at least for the near-term)
having more than 2^16 apps installed. In the process of testing how
Sessions work in Connman I believe I've stumbled upon either a bug in
the Sessions implementation or an intended (purposeful) behavior.


The code is pretty new, so it is quite likely you will find bugs.

I
rather hope it's a bug since otherwise it will force me to put a
"shim" in front of the Connman Session API. What I have observed is
that for *every* session that is created, a new set of iptable rules
is created . . . even if the same user/group/SELinux context is used
to create the session.


Yeah, the assumption is that each application uses its own unique
identifier. The code doesn't check whether more than one Session uses
the same identifier (GID). So this is a bug.

For instance, if I have two applications (both
the same user - with a session policy keyed to the UID) and they
each create a Session instance, then the iptables "mangle" table
(OUTPUT) looks something like this:

Chain connman-OUTPUT (1 references)
 pkts bytes target  prot opt in  out  source     destination
   20  1134 MARK    all  --  *   *    0.0.0.0/0  0.0.0.0/0    owner UID match 1000 MARK set 0x100
   20  1134 MARK    all  --  *   *    0.0.0.0/0  0.0.0.0/0    owner UID match 1000 MARK set 0x101

So it would appear the *last* session that is created will effectively
mark (and clobber) the first marking for the packet (0x101 mark
clobbers the earlier 0x100 mark since it's last in the chain). I hope
this is *not* what was intended. I expected that iptable rules for
Session policies for the same UID/GID/SELinux context would "share"
the same session (and thus statistics). So if two applications running
as the same UID/GID/SELinux context requested a Connman Session, the
first requestor would create it while the second application would get
a "reference" to that session. This session would remain valid until
the last application (with the same UID/GID/SELinux context) destroyed
the session. In a sense the Sessions become reference counted per
session policy. As it stands today it would appear Connman creates far
more iptable rules than necessary to support multiple applications
running as the same user (or in the same group or SELinux context). Am
I correct in assuming this is a design error in the current
implementation of Sessions?


You are right. We need to fix this. Patches? :D
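
The sharing behaviour discussed here can be pictured as a reference
count per UID: the first session for a UID installs the MARK rule, later
sessions for the same UID reuse its mark, and the rule is removed when
the last one goes away. The following is only a sketch of that idea, not
ConnMan code. session_acquire, session_release and all variable names
are hypothetical, the chain name is taken from the listing in this
thread, and the iptables commands are printed rather than run.

```shell
next_mark=256   # first session mark, as described in this thread

# session_acquire UID: sets $session_mark; prints a rule only for the
# first session of that UID.
session_acquire() {
    case $1 in ''|*[!0-9]*) return 2 ;; esac   # numeric UIDs only
    eval "refs=\${refs_$1:-0}"
    if [ "$refs" -eq 0 ]; then
        eval "mark_$1=$next_mark"
        printf 'iptables -t mangle -A connman-OUTPUT -m owner --uid-owner %s -j MARK --set-mark %s\n' "$1" "$next_mark"
        next_mark=$((next_mark + 1))
    fi
    eval "refs_$1=$((refs + 1))"
    eval "session_mark=\$mark_$1"
}

# session_release UID: drops one reference; prints the delete command
# once the last session of that UID is gone.
session_release() {
    case $1 in ''|*[!0-9]*) return 2 ;; esac
    eval "refs=\${refs_$1:-0}"
    [ "$refs" -gt 0 ] || return 1
    refs=$((refs - 1))
    eval "refs_$1=$refs"
    if [ "$refs" -eq 0 ]; then
        eval "mark=\$mark_$1"
        printf 'iptables -t mangle -D connman-OUTPUT -m owner --uid-owner %s -j MARK --set-mark %s\n' "$1" "$mark"
        unset "mark_$1"
    fi
}

session_acquire 1000   # first session for UID 1000: prints the MARK rule
session_acquire 1000   # second session: prints nothing, same $session_mark
```

With this shape, two applications running as UID 1000 end up with one
MARK rule and one set of statistics instead of two competing rules.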

As you can see, setting the Connman MARK is no longer destructive to a
value that might already exist in the lower 16-bits of the 32-bit
word. Likewise, when reading and comparing the value the lower 16-bits
of the 32-bits value are effectively masked out so it won't impact the
Connman session match test.
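
A minimal sketch of the mask arithmetic, assuming the semantics
documented in iptables-extensions(8): "--set-mark value/mask" clears the
bits given by mask and then ORs in value, while "--mark value/mask"
matches when (mark & mask) equals value. Under that reading, the mask
given to --set-mark names the bits that get overwritten, so keeping the
lower 16 bits intact means using a mask that covers the upper 16 bits.
The helper names and the example values are hypothetical, and placing
the session id in the upper half by shifting it left 16 bits is an
assumption.

```shell
# Model of "-j MARK --set-mark value/mask": clear the mask bits, OR in value.
#   $1 = current packet mark, $2 = value, $3 = mask
set_mark() {
    printf '0x%08x\n' $(( ($1 & ~$3 & 0xFFFFFFFF) | $2 ))
}

# Model of "-m mark --mark value/mask": true when (mark & mask) == value.
#   $1 = packet mark, $2 = value, $3 = mask
mark_matches() {
    [ $(( $1 & $3 )) -eq $(( $2 )) ]
}

existing=0x00001234   # hypothetical external marking in the lower 16 bits
session=0x01000000    # assumption: session id 0x100 shifted into the upper 16 bits

set_mark "$existing" "$session" 0xFFFF0000   # prints 0x01001234, lower bits kept
set_mark "$existing" "$session" 0x0000FFFF   # prints 0x01000000, lower bits cleared
```

With the 0xFFFF0000 mask, "mark_matches 0x01001234 0x01000000 0xFFFF0000"
succeeds whatever the lower 16 bits contain.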

Part of my problem is that I need to create additional iptable rules
that place data-caps (e.g. a quota) on certain sessions (apps) which
are then optionally partitioned into additional groups. I can use the
power/flexibility of iptables to implement this logic but I need to
cooperatively use the same netfilter 32-bit value to store additional
information about the connection and their associated packets. This
change would help Connman behave as a better netfilter citizen while
allowing me to piggy-back on top of the existing marking done by
Connman.
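
As a rough illustration of such a data cap, the stock "quota" match can
be combined with the session mark. This is a sketch under stated
assumptions and not the prototype rules mentioned in this thread, which
are not shown here. The helper quota_rules, the chain name and the byte
limit are hypothetical. The sketch also assumes the session mark is
visible on received packets, as the existing INPUT rule implies. The
commands are printed, not executed.

```shell
# quota_rules is a hypothetical helper: $1 = session mark, $2 = byte limit.
quota_rules() {
    chain="quota-$1"
    printf 'iptables -t filter -N %s\n' "$chain"
    # Both directions jump to one chain so a single quota covers sent + received
    printf 'iptables -t filter -A INPUT -m mark --mark %s -j %s\n' "$1" "$chain"
    printf 'iptables -t filter -A OUTPUT -m mark --mark %s -j %s\n' "$1" "$chain"
    # While the quota lasts, return to the calling chain
    printf 'iptables -t filter -A %s -m quota --quota %s -j RETURN\n' "$chain" "$2"
    # Once the quota is used up, the first rule stops matching and packets are dropped
    printf 'iptables -t filter -A %s -j DROP\n' "$chain"
}

# Example: cap session 256 at 100 MiB (104857600 bytes) in total
quota_rules 256 104857600
```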



Well, the quota thing is something we also want to support natively
with sessions. That is also why we count the packets (currently with
nfacct, but we don't use that information yet).


I'm sure you realize that you don't explicitly need nfacct to collect
statistics. Granted, nfacct gives you nicely named buckets you can
access with the nfacct utility and likely over netlink as well. But
each iptables rule has a byte/packet count associated with it.


Dumping the iptables all the time is kind of excessive.

So if
a packet merely passes through a rule (a simple "do nothing" rule) it
will increment that rule's counters. Plus, iptables gives you a
way to query/reset the stats for a specific rule if I'm not mistaken.


Yes, that would probably also work, but this is polling, which I
try to avoid.
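
For reference, a sketch of what reading the per-rule counters looks like.
It is polling-based, which is exactly the drawback raised here. The
helper mark_bytes is hypothetical. It expects a listing in the format
shown earlier in this thread, for example from
"iptables -t mangle -L connman-OUTPUT -v -n -x", and the sample data
below is copied from that listing with one rule per line.

```shell
# mark_bytes MARK: read a rule listing on stdin and sum the byte counters
# of all rules whose text ends in "MARK set <MARK>".
mark_bytes() {
    awk -v mark="$1" '
        $1 ~ /^[0-9]+$/ && $(NF-1) == "set" && $NF == mark { total += $2 }
        END { print total + 0 }
    '
}

# Sample listing taken from this thread
sample='Chain connman-OUTPUT (1 references)
 pkts bytes target  prot opt in  out  source     destination
   20  1134 MARK    all  --  *   *    0.0.0.0/0  0.0.0.0/0    owner UID match 1000 MARK set 0x100
   20  1134 MARK    all  --  *   *    0.0.0.0/0  0.0.0.0/0    owner UID match 1000 MARK set 0x101'

printf '%s\n' "$sample" | mark_bytes 0x100   # prints 1134
```

On a live system the counters of a chain can be reset with
"iptables -t mangle -Z connman-OUTPUT"; both listing and resetting
require root.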

I'm willing to make the necessary code changes to implement this
solution. I'm just seeking feedback of whether a better solution
exists (that I'm unaware of) or perhaps recommendations on my
approach. Of course any patches I make would be available to the
community and hopefully could eventually be integrated into the main
baseline.



I am happy to see patches!

cheers,
daniel


I'm really interested in the direction you're headed with Connman
Sessions. I'd like to leverage this work and perhaps contribute the
generic parts of the solutions I devise to the project. I realize not
everyone has similar needs but it would appear (at least so far)
Connman's goals are fairly closely aligned with mine.


That is great to hear. We really should strive to make it useful for
more than only my use cases. Therefore I am really happy if you
join the effort to push forward with the Session API and make it
a great feature!

cheers,
daniel
_______________________________________________
connman mailing list
connman@connman.net
https://lists.connman.net/mailman/listinfo/connman
