Re: [squid-dev] [PATCH] invalid certificates and spliced connections

2014-12-26 Thread Alex Rousskov
On 12/22/2014 08:24 AM, Tsantilas Christos wrote:
> On 12/19/2014 10:03 PM, Amos Jeffries wrote:

>> Possibly a fast ACL is appropriate:
>> ssl_bump_error allow/deny [acls...]
>>
>> Which is run to make the above decision. ONLY in the event that TLS
>> protocol syntax errors or malformations.  Not for cert/cipher/option
>> issues such as bad combinations of valid things, or insecure settings.

I predict it is going to be increasingly difficult to draw the line
around "protocol syntax errors or malformations" to deterministically
isolate those problems from everything else, including "bad combinations
of valid things, or insecure settings". There is usually a significant
gray area between "malformed" and "valid", where the decision depends on
the deployment environment and business logic.

We already have a very flexible certificate validator API and an
ACL-controlled SSL error handling logic. That ought to be sufficient as
far as allowing or denying SSL errors is concerned.
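
For illustration, a minimal squid.conf sketch of that existing machinery
(the directives and ACL type are real; the error name and policy below
are examples only, not a recommendation):

  acl tolerableErrors ssl_error X509_V_ERR_DEPTH_ZERO_SELF_SIGNED_CERT
  sslproxy_cert_error allow tolerableErrors
  sslproxy_cert_error deny all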

We lack configuration options to control how honored errors are handled,
but that is a different problem. The default "terminate the server
connection and bump the client to serve the error message" behavior
suggested by Christos is secure. It will not satisfy everybody, so future
configuration options will make that behavior configurable.


> I believe it is OK to leave this patch as is for now.
> This patch will abort immediately:
>  - if no certificate is found and checked using the Squid SSL verify
> procedure (mostly because the server response was malformed or unsupported)
>  - if server certificate validation failed
> 
> I believe this handles most of the cases. The cipher or option issues
> may still be ignored, but these errors may be caused by features
> unsupported by the OpenSSL library we are using.
> But we still verify server certificates, and this should be
> enough for security.


I agree with Christos here. SslBump configuration is already quite
complex. We should not add more configuration options unless absolutely
necessary. The default behavior that Christos is proposing solves the
security problem at hand. Let's postpone adding more options until there
are specific use cases that require them.


Thank you,

Alex.



Re: [squid-dev] [PATCH] Non-HTTP bypass

2014-12-30 Thread Alex Rousskov
On 10/21/2014 11:29 AM, Tsantilas Christos wrote:
>>>- Adds "on_first_request_error", a new ACL-driven squid.conf
>>> directive
>>> that can be used to establish a blind TCP tunnel which relays all bytes
>>> from/to the intercepted connection to/from the intended destination
>>> address. See the sketch above.
>>> The on_first_request_error directive supports fast ACLs only.

>> What a nasty name for an access list.

Please try to be more specific, especially when you are being negative
about others' work. "Nasty" can mean many things to many people. In this
particular case, your comment carries no [positive] value at all. It is
difficult for the code author to make something "less nasty" because
they do not know what "nasty" means to you in this context.

BTW, on_first_request_error does not name an access list.


>> I believe tcp_tunnel or tunnel_transparent would be more descriptive of
>> what the test is *checking* for.

* If we want to tell Squid when to tunnel, then tcp_tunnel is a good
solution (but this is not what this proposal tries to do).

* If we want to tell Squid what to do with connections it does not
recognize, then tcp_tunnel is the worst alternative of the three I can
think of (see below for details).


The "tcp_tunnel" idea currently lacks context: Is it applied when we
accept a connection, when we authenticate the user, when we parse the
10th request, etc.? There are many very different cases where we may want to
start tunneling, for many different reasons. This lack of specifics
makes the idea look simple and, hence, appealing, but the devil is
usually revealed in the details.

The implemented "on_first_request_error" feature is much smaller and has
a relatively narrow context: When Squid realizes that it does not
recognize the connection, it consults the proposed directive for
instructions. The "on_error" concept is not new in IT and should be
familiar to many admins.


The suggested feature can be converted into a tcp_tunnel feature if:

* we add a hard-coded first_request_error ACL

* we agree that tcp_tunnel rules are evaluated when Squid realizes that it
does not recognize the first request on a connection (at least).

Then, instead of writing:

  on_first_request_error tunnel some
  on_first_request_error respond other
  on_first_request_error future_action yet_another

folks will write:

  tcp_tunnel on_first_request_error some
  respond_with_error on_first_request_error other
  future_action on_first_request_error yet_another

which is clearly (I hope!) worse from a configuration clarity and
convenience point of view. However, folks would also be able to write:

  tcp_tunnel some_other_error some
  respond_with_error some_other_place other
  future_action some_other_category yet_another

which is an improvement over the more rigid on_first_request_error
scheme that restricts application to only certain kinds of errors in
certain places.


The design space can be summarized like this:

  # extreme left: general-purpose on-error handling:
  on_error  error_kind_acl acl ...

  # proposed feature: unrecognized connection handling;
  # the error_kind_acl is implicit in this design:
  on_first_request_error  acl ...

  # suggested extreme right: general-purpose actions:
  tcp_tunnel error_detection_acl acl ...
  ssl_tunnel error_detection_acl acl ...
  respond_with_error error_detection_acl acl ...
  terminate error_detection_acl acl ...


Since many Squid actions are likely to be appropriate under the same
conditions (tunnel, respond with an error, terminate, ssl-tunnel, etc.),
I do not think the suggested extreme is the best design. We should
either move left to the general on_error configuration or stay with the
proposal currently under review.


>> The "only checked on first request failure" detail can be left in the
>> directive documentation.

It would be wrong to provide a general-looking "tcp_tunnel" option that
can only be applied in one specific case. The initial implementation may
be restricted to one case, of course, but a general feature needs enough
knobs to be usable in many contexts. In this case, this means a group of
hard-coded ACL guards that only match in some specific supported
contexts (see above for examples). At the end of your email, you have
already identified an error-unrelated case where the tcp_tunnel
directive would be required...


>> I also suspect that is a false statement since
>> anything we do with a CONNECT body that fails will need a matching check
>> applied.

Not sure what you mean here, but please note that the first request
inside a bumped CONNECT tunnel is still the first request as far as the
proposed feature is concerned.

To avoid misunderstanding, I am not saying that on_first_request_error
name or its description is perfect. Better alternatives are more than
welcome (and similar names/descriptions would be needed even if we move
towards a more general on_error design)!


>> ssl-bump calls its version of tunnelling "ssl_bump splice". So if we are
>> c

Re: [squid-dev] [PATCH] Non-HTTP bypass

2014-12-30 Thread Alex Rousskov
On 12/30/2014 06:19 PM, Amos Jeffries wrote:
> On 31/12/2014 7:30 a.m., Alex Rousskov wrote:
>> On 10/21/2014 11:29 AM, Tsantilas Christos wrote:
>>>>> - Adds "on_first_request_error", a new ACL-driven squid.conf 
>>>>> directive that can be used to establish a blind TCP tunnel
>>>>> which relays all bytes from/to the intercepted connection
>>>>> to/from the intended destination address. See the sketch
>>>>> above. The on_first_request_error directive supports fast
>>>>> ACLs only.


> IMO this directive needs to be friendly and clear as it's going to be
> somewhat popular.

Agreed. I am not going to respond to most of your thoughts regarding the
directive name. Suffice to say that I disagree with most of your
analysis of natural language issues, but do not have enough cycles to
engage you on that exciting topic. For now, let's focus on the proposed
feature itself and decide on the final directive(s) naming later.

Most of my post was not about naming at all. It was about three ways to
control non-HTTP bypass functionality. In your response, you have not
addressed those points at all, so I assume you interpreted them as a
discussion about naming, which rendered my last email mostly useless.
However, since you seem to be OK with the updated patch going in with
some to-be-determined directive name, perhaps we are making progress
after all!


> The "on_error" as you say is already fairly well known. In particular
> its known as an event handler hook in JavaScript and other languages.
> Like I said above squid.conf is more a declarative language than those.


Well-known event handlers use (and often combine) two very different
concepts:

* A declaration that an error is handled by a named handler:
  on_error=foo

* An inline implementation of an anonymous handler:
  on_error="for (i = 0; i < count; ++i) { foo(i); ..."


ACL-driven directives in Squid essentially combine handler declarations
with a bunch of if statements similar to an inline implementation:

  on_error="if acl1 then foo1; elsif acl2 then foo2; else foo3;"

What Squid uses is more declarative than some well-known on_error
handler approaches and less declarative than others. Even if it were
possible, deciding the exact degree of "declarativeness" is not going to
help us make progress here IMO.

What would help is to decide whether we want to focus on

  A) multiple conditions for establishing a TCP tunnel;
  B) multiple ways to handle an unrecognized protocol error; OR
  C) multiple ways to handle multiple errors.

IMO, we want (B) or perhaps (C) while leaving (A) as a separate
out-of-scope feature.

The proposed patch implements (B). To implement (C), the patch needs to
add an ACL type to distinguish an "unrecognized protocol" error from
other errors.


> 2) "first_request_error" [...] is actually
> declaring an action to perform *instead* of producing errors

No. It declares actions to perform when an error is detected. Those
actions may and often do include "producing an error" to the user, of
course. The fact that there may be many different actions is critical
here because the tcp_tunnel alternative does not handle multiple actions
well -- it focuses on one "tunnel" action only.


> 3) This directive is not relevant to HTTP, but to SSL/TLS.

Not exactly. The feature deals with protocols that Squid does not
recognize. We may deal with plain and encrypted HTTP connections today,
but we will probably have to deal with FTP, Web Sockets, and other
protocol connections tomorrow. SSL/TLS is just a small slice of that puzzle.


> Coming back to this now (and accepting your scope narrowing from tcp_
> to ssl_) we should probably call this ssl_unknown_protocol. Since
> anything on port 443 which is syntactically HTTP but has some
> invalidity is rightfully responded to with an error, no need to ask the
> admin at that point.

We do _not_ want to limit this feature to receiving a non-SSL/TLS
protocol on an intercepting https_port. We want to support, at least:

* non-HTTP traffic on intercepted http_port
* non-SSL/TLS traffic on intercepted https_port
* non-HTTP traffic inside successfully bumped http_port CONNECT tunnel
* non-HTTP traffic inside successfully bumped https_port SSL/TLS tunnel


Moreover, if I am interpreting your "syntactically HTTP but invalid"
condition correctly, then many admins will disagree with the universal
rightfulness of sending the user an error under those conditions.


> PS. can we please agree to start using tls_ instead of ssl_? even
> SSLv3 is a dead duck now and everybody knows it. At least on new
> directives.

I am glad SSLv3 is dying, but the degree of SSL death, and what "SSL"
means to admins is not relevant to this feature AFAICT: The feature is
not spec

Re: [squid-dev] unsupported protocol classification

2014-12-31 Thread Alex Rousskov
[ I am changing the Subject line for this sub-thread because this new
discussion is not really relevant to the unsupported protocol bypass
feature, even though that bypass feature will be used by those who need
to classify unsupported protocols. ]


On 12/31/2014 03:33 AM, Marcus Kool wrote:

> The current functionality of filtering is divided between Squid itself and
> 3rd party software (ICAP daemons and URL redirectors).

... as well as external ACLs and eCAP adapters.


> I plead for an interface where an external helper can decide what to do
> with an unknown protocol inside a tunnel because it is much more flexible
> than using ACLs and extending Squid with detection of (many) protocols.

I doubt pleading will be enough, unfortunately, because a considerable
amount of coding and design expertise is required to fulfill your dream.
IMO, a quality implementation would involve:

1. Encoding the tunnel information (including traffic) in [small]
HTTP-like messages to be passed to ICAP/eCAP services. It is important
to get this API design right while anticipating complications like
servers that speak first, agents that do not send Hellos until they hear
the other agent's Hello, and fragmented Hellos. Most likely, the design
will involve two tightly linked but concurrent streams of adaptation
messages: user->Squid->origin and origin->Squid->user. Let's call that
TUNMOD, as opposed to the existing REQMOD and RESPMOD.

2. Writing adaptation hooks to pass tunnel information (using TUNMOD
design above) to adaptation services. The primary difficulty here is
handling incremental "give me more" and "give them more" decisions while
shoveling tunneled bytes. The current tunneling code does not do any
adaptation at all, so the developers would be starting from scratch
(albeit with good examples available from non-tunneling code dealing
with HTTP/FTP requests and HTTP/FTP responses).

3. Implementing more actions than the already implemented "start a blind
tunnel" and "respond with an error". The "shovel this to the other side
and then come back to me with the newly received bytes" action would be
essential in many production cases, for example.

The above is a large project. I do not recall any projects of that size
and complexity implemented without sponsors in recent years, but YMMV.


Please note that modern Squid already has an API that lets 3rd party
software pick one of the supported actions. It is called annotations:
External software sends Squid an annotation and the admin configures
Squid to do X when annotation Y is received in context Z.
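
For example, a hedged sketch of acting on such an annotation (the note
ACL type and adaptation_access directive exist in recent Squid versions;
the annotation name, value, and service name are hypothetical):

  # an ICAP service or external ACL helper annotated the transaction
  # with protocol=skype; the admin reacts to that annotation here
  acl isSkype note protocol skype
  adaptation_access blockerService deny isSkype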


> A while back when we discussed the older sslBump not being able to cope
> with Skype I suggested to use ICAP so that the ICAP daemon receives a
> REQMOD/RESPMOD message with CONNECT and intercepted content, which also is
> a valid option for me.

Yes, ICAP/eCAP is the right direction here IMO, but there are several
challenges on that road. I tried to detail them above.


HTH,

Alex.


Re: [squid-dev] [PATCH] Non-HTTP bypass

2014-12-31 Thread Alex Rousskov
On 12/31/2014 03:33 AM, Marcus Kool wrote:
> On 12/31/2014 05:54 AM, Alex Rousskov wrote:
>> What would help is to decide whether we want to focus on
>>
>>A) multiple conditions for establishing a TCP tunnel;
>>B) multiple ways to handle an unrecognized protocol error; OR
>>C) multiple ways to handle multiple errors.
>>
>> IMO, we want (B) or perhaps (C) while leaving (A) as a separate
>> out-of-scope feature.
>>
>> The proposed patch implements (B). To implement (C), the patch needs to
>> add an ACL type to distinguish an "unrecognized protocol" error from
>> other errors.


> From an administrators point of view, the admins that want Squid to
> filter internet access, definitely want (B).  They want (B) to block
> audio, video, SSH tunnels, VPNs, chat, file sharing, webdisks and all
> sorts of applications (but not all!) that use port 443.

Agreed, except this is not limited to port 443. The scope includes
intercepted port 80 connections and even CONNECT tunnels.


> Basically
> this means that admins desire a more fine-grained control about what to
> do with each tunnel.

There are two different needs here, actually:

1. A choice of actions (i.e., "what to do") when dealing with an
unsupported protocol. Currently, there is only one action: Send an HTTP
error response. The proposed feature adds another action (tunnel) and,
more importantly, adds a configuration interface to support more actions
later.

2. A way to further classify an unsupported protocol (i.e.,
"fine-grained control"). I started a new thread on this topic as it is
not about the proposed bypass feature.


Cheers,

Alex.



Re: [squid-dev] unsupported protocol classification

2014-12-31 Thread Alex Rousskov
On 12/31/2014 10:39 AM, Marcus Kool wrote:
> On 12/31/2014 02:23 PM, Alex Rousskov wrote:
>> 2. Writing adaptation hooks to pass tunnel information (using TUNMOD
>> design above) to adaptation services. The primary difficulty here is
>> handling incremental "give me more" and "give them more" decisions while
>> shoveling tunneled bytes. The current tunneling code does not do any
>> adaptation at all so the developers would be starting from scratch
>> (albeit with good examples available from non-tunneling code dealing
>> with HTTP/FTP requests and HTTP/FTP responses).


> It can be simpler. TUNMOD replies can be limited to:
> DONTKNOW - continue with what is happening and keep the TUNMOD server informed
> ACCEPT - continue and do not inform the TUNMOD server any more about this tunnel
> BLOCK - close the tunnel

Yes, of course, but supporting just the bare minimum above and
supporting it well is 80-90% of the total effort IMO.


> I think there is no need for adaptation since one accepts a webdisk, voice
> chat, VPN or whatever, or one does not accept it. So adaptation, as used
> for HTTP, is not an important feature.

In Squid terminology, "adaptation" includes passive, read-only
inspection. However, I am sure some folks would want to send error
messages in unsupported-by-Squid protocols as well. That would require
injecting content received from a TUNMOD service, just like the current
REQMOD and RESPMOD services inject HTTP and FTP error messages.


Happy New Year,

Alex.



Re: [squid-dev] [PATCH] adapting 100-Continue / A Bug 4067 fix

2014-12-31 Thread Alex Rousskov
On 11/09/2014 02:02 PM, Tsantilas Christos wrote:

>  void
>  Http::Server::processParsedRequest(ClientSocketContext *context)
>  {
> +if (!buildHttpRequest(context))
> +return;
> +
> +if (Config.accessList.forceRequestBodyContinuation) {
> +ClientHttpRequest *http = context->http;
> +HttpRequest *request = http->request;
> +ACLFilledChecklist 
> bodyContinuationCheck(Config.accessList.forceRequestBodyContinuation, 
> request, NULL);
> +if (bodyContinuationCheck.fastCheck() == ACCESS_ALLOWED) {
> +debugs(33, 5, "Body Continuation forced");
> +request->forcedBodyContinuation = true;


The HTTP code above sends 100-Continue responses to HTTP GET messages
unless the admin is very careful with the ACLs. This can be reproduced
trivially with

  force_request_body_continuation allow all

We should not evaluate force_request_body_continuation if the request
does not have a body IMO. The force_request_body_continuation
documentation makes that option specific to upload requests. If you
agree, please adjust the committed code accordingly.
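
Meanwhile, a hedged squid.conf workaround sketch that approximates
"upload requests only" (the ACL below is illustrative, not exhaustive;
it misses chunked uploads, for example):

  # only force 100-Continue for requests that announce a body
  acl announcesBody req_header Content-Length .
  force_request_body_continuation allow announcesBody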


The similar FTP check seems to be inside the upload-specific code and,
hence, should not need additional "do we expect a body?" guards.


Thank you,

Alex.


Re: [squid-dev] [PATCH] Non-HTTP bypass

2015-01-02 Thread Alex Rousskov
On 01/02/2015 09:16 AM, Marcus Kool wrote:
> On 12/31/2014 02:31 PM, Alex Rousskov wrote:
>> On 12/31/2014 03:33 AM, Marcus Kool wrote:
>>> On 12/31/2014 05:54 AM, Alex Rousskov wrote:
>>>> What would help is to decide whether we want to focus on
>>>>
>>>> A) multiple conditions for establishing a TCP tunnel;
>>>> B) multiple ways to handle an unrecognized protocol error; OR
>>>> C) multiple ways to handle multiple errors.
>>>>
>>>> IMO, we want (B) or perhaps (C) while leaving (A) as a separate
>>>> out-of-scope feature.
>>>>
>>>> The proposed patch implements (B). To implement (C), the patch needs to
>>>> add an ACL type to distinguish an "unrecognized protocol" error from
>>>> other errors.
>>
>>
>>>  From an administrators point of view, the admins that want Squid to
>>> filter internet access, definitely want (B).  They want (B) to block
>>> audio, video, SSH tunnnels, VPNs, chat, file sharing, webdisks and all
>>> sorts of applications (but not all!) that use port 443.
>>
>> Agreed, except this is not limited to port 443. The scope includes
>> intercepted port 80 connections and even CONNECT tunnels.

> If CONNECT tunnels are in scope, then so are all the applications that
> use it, including webdisk, audio, video, SSH etc.

Yes, of course. The receiving Squid port and current connection state
affect what protocol Squid expects to find/parse, whether Squid may
perform (or has performed) bumping, etc., but all those decisions and
actions happen before and/or after the new feature kicks in. The
detection of an unsupported protocol and computation of the
corresponding tunnel/respond decision are what we focus on here.


> I think it was Amos who said that application builders should use
> application-specific
> ports, but reality is that all firewalls block those ports by default.

Yes, the "each application has a dedicated port" idea[l] is too far from
reality to matter.


>>> Basically
>>> this means that admins desire a more fine-grained control about what to
>>> do with each tunnel.
>>
>> There are two different needs here, actually:
>>
>> 1. A choice of actions (i.e., "what to do") when dealing with an
>> unsupported protocol. Currently, there is only one action: Send an HTTP
>> error response. The proposed feature adds another action (tunnel) and,
>> more importantly, adds a configuration interface to support more actions
>> later.
> 
> Sending an HTTP error to an application that does not speak HTTP is not
> very useful.  Skype, SSH, videoplayers etc. only get confused at best.
> Simply closing the tunnel may be better and may result in an end user
> message
> 'cannot connect to ...' instead of 'server sends garbage' or 'undefined
> protocol'.


I may have already mentioned that "terminate" is another candidate
action. For now, it is possible to emulate that action in some
deployments by using custom "TCP_RESET" error bodies, but that hack is
too rigid: It does not discriminate the error context/environment. That
is what ACLs in the new feature are meant to do.

Whether it is better to simply terminate the connection or send some
diagnostic text to the user client should be up to the admin to decide.
Different deployment environments will benefit from different actions.
The proposed feature builds the initial framework for those decisions.


Cheers,

Alex.



Re: [squid-dev] Moved PID file management from Coordinator to Master

2015-01-15 Thread Alex Rousskov
On 01/14/2015 03:09 AM, Amos Jeffries wrote:
>> On 01/14/2015 11:25 AM, Amos Jeffries wrote:
>>> Does the master process get exit status of *all* worker processes
>>> and the sub-children down N levels? It was my understanding that in
>>> SMP each worker disker etc is a fork() and the child becomes new
>>> coordinator.


Hi Amos,

In SMP, there is only one Coordinator process, created by the Master
process.


> I suspect we will find that some diskers etc are in fact spawned by
> either coordinator or a worker and one level deeper than the master
> can see.


All SMP kids (Coordinator, workers, and diskers) are started by the
Master process. There are no multiple levels as far as kid startup and
waiting are concerned and, hence, there is no "level deeper than the
master can see".

Needless to say, there are processes that are not SMP kids. For example,
helpers do not run Squid code and, hence, they are not kids and are not
affected by the proposed fix.


HTH,

Alex.


Re: [squid-dev] Moved PID file management from Coordinator to Master

2015-01-19 Thread Alex Rousskov
On 01/16/2015 08:51 AM, Amos Jeffries wrote:
> On 16/01/2015 11:29 a.m., Alex Rousskov wrote:
>> In SMP, there is only one Coordinator process, created by the
>> Master process.

>> All SMP kids (Coordinator, workers, and diskers) are started by
>> the Master process. There are no multiple levels as far as kid
>> startup and waiting are concerned and, hence, there is no "level
>> deeper than the master can see".


> Hmm, okay. Then I have no problem per se with this change of responsibility.

Great, thank you.


> I do still think the coordinator needs to remain active until last out
> of the kids though, so they can still use it to coordinate during
> their shutdowns. Having it be the first up and last down would solve a
> few architectural problems where kids need to to collaborate on
> things, like log rotations or broadcasting their availability.

Agreed!


Alex.


Re: [squid-dev] [PATCH] HTTP Response Parser upgrade

2015-01-22 Thread Alex Rousskov
On 01/20/2015 06:58 AM, Amos Jeffries wrote:

> Updated patch attached resolving the issues found by coadvisor in the
> original.


> +size_t
> +Client::needBufferSpace(const SBuf &readBuf, const size_t minSpace) const
> +{
> +size_t space = readBuf.spaceSize(); // available space w/o heroic 
> measures

If the above line is the only readBuf usage in this method, pass
readBuf.spaceSize() as a parameter instead of the whole buffer.
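
A hedged sketch of the suggested narrowing (the parameter name is
illustrative):

  /// determine how much space the buffer needs to reserve
  size_t needBufferSpace(const size_t availableSpace, const size_t minSpace) const;
  // caller: needBufferSpace(inBuf.spaceSize(), limitBuffer - inBuf.length())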


> +if (space < minSpace) {
> +const size_t maxSpace = SBuf::maxSize; // absolute best
> +space = min(minSpace, maxSpace); // do not promise more than asked
> +}

Here and elsewhere in Client::needBufferSpace(), the returned space size
may be lower than minSpace. s/minSpace/wantedSpace/ or something like that.

> + * If the pipe is totally full, don't register the read handler.
> + * The BodyPipe will call our noteMoreBodySpaceAvailable() method
> + * when it has free space again.

The above comment does not match the code. The code does not deal with
read handler registration, only size calculations.


> +size_t adaptation_space =
> +virginBodyDestination->buf().potentialSpaceSize();

Make const if possible.


> +/// determine how much space the buffer needs to reserve
> +size_t needBufferSpace(const SBuf &readBuf, const size_t minSpace) const;

need+something implies a notification or state changing method, which is
not what this method is about. Based on the caller, I suggest

s/needBufferSpace/calcReadSize/ or
s/needBufferSpace/readSize/




>  void
>  HttpStateData::readReply(const CommIoCbParams &io)
>  {
...
> +assert(!flags.do_next_read); // XXX: should have been set false by 
> mayReadVirginBody()
...
> +assert(Comm::IsConnOpen(serverConnection));
> +assert(io.conn->fd == serverConnection->fd);

These are all new asserts in an async job callback. Should we use
Must()s instead to avoid crashing the whole Squid process when dealing
with what seems to be a transaction-specific problem?
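
For context, a minimal sketch of the difference (Must() is Squid's
existing exception-throwing check; the conditions come from the quoted
code):

  // Must() throws, unwinding only the current job/transaction
  Must(Comm::IsConnOpen(serverConnection));
  // assert() aborts the whole Squid process on failure
  assert(io.conn->fd == serverConnection->fd);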


> +// how much we want to read
> +const int read_size = needBufferSpace(inBuf, (limitBuffer - 
> inBuf.length()));

Unsigned->signed conversion. Can you use size_t for read_size?

> +if (read_size < 1) {

... and then convert this to !read_size or (read_size == 0) for clarity?


> +/// amount of message payload/body received so far.
> +int64_t payloadSeen;
> +/// positive when we read more than we wanted
> +int64_t payloadTruncated;

Have you considered making these unsigned?


> +/// parse scan to find the mime headers block for current message
> +bool findMimeBlock(const char *which, size_t limit);

Find is usually a const method. This method does a lot more than just
finding MIME headers. To fix that and to supply information that the
method name does not already reveal, I would do:

/// returns whether the header was successfully parsed
bool parseMimeHeader(const char *headerKind, const size_t sizeLimit);

Note that the limit parameter is const.


> +int64_t mimeHeaderBytes = 0;
> +// XXX: c_str() reallocates. performance regression.
> +if ((mimeHeaderBytes = headersEnd(buf_.c_str(), buf_.length())) == 
> 0) {
> +if (buf_.length()+firstLineSize() >= limit) {
> +debugs(33, 5, "Too large " << which);
> +parseStatusCode = Http::scHeaderTooLarge;
> +parsingStage_ = HTTP_PARSE_DONE;
> +} else
> +debugs(33, 5, "Incomplete " << which << ", waiting for end 
> of headers");
> +return false;
> +}
> +mimeHeaderBlock_ = buf_.consume(mimeHeaderBytes);
> +debugs(74, 5, "mime header (0-" << mimeHeaderBytes << ") {" << 
> mimeHeaderBlock_ << "}");


The above is better written as

// XXX: c_str() reallocates. performance regression.
if (const int64_t mimeHeaderBytes = ...) {
   mimeHeaderBlock_ = buf_.consume(mimeHeaderBytes);
   debugs(...);
} else {
   ...
   return false;
}

but there is more that needs to be done in the above code:

> +// Squid could handle these headers, but admin does not want to
> +if (messageHeaderSize() >= limit) {
> +debugs(33, 5, "Too large " << which);
> +parseStatusCode = Http::scHeaderTooLarge;
> +return false;
> +}

This should be rewritten to go before we initialize mimeHeaderBlock_
IMO. Otherwise, the damage of keeping the admin-prohibited header is
already done.

> +const int
> +Http::One::ResponseParser::parseResponseStatusAndReason()
> +{
> +if (buf_.isEmpty())
> +return 0;
> +

Remove this check (and fix the tokenizer if it cannot handle empty buffers).

> +// NP: search space is >3 to get terminator character)
> +if(!tok.prefix(status, CharacterSet::DIGIT, 4))
> +return -1; // invalid status
> +// NOTE: multiple SP or non-SP bytes between version and status code 
> are invalid.
> +if (tok.atEnd())

Re: [squid-dev] [PATCH] HTTP Response Parser upgrade

2015-01-23 Thread Alex Rousskov
On 01/23/2015 01:00 AM, Amos Jeffries wrote:
> On 23/01/2015 6:47 a.m., Alex Rousskov wrote:
>> On 01/20/2015 06:58 AM, Amos Jeffries wrote:
>>> +/// determine how much space the buffer needs to reserve
>>> +size_t needBufferSpace(const SBuf &readBuf, const size_t minSpace) const;

>> need+something implies a notification or state changing method,
>> which is not what this method is about. Based on the caller, I
>> suggest to

>> s/needBufferSpace/calcReadSize/ or s/needBufferSpace/readSize/


> While the current caller is only in http.cc, this is intended for use
> by all Client child classes in their own ways.

The suggested names were protocol-neutral AFAICT.


> I'm going with calcBufferSpaceToReserve() for now.

I doubt you can have a generic reservation algorithm without knowing
what the reservation is for (i.e. "reading" of some sort), but
calcBufferSpaceToReserve() is indeed better than the original
needBufferSpace() on other levels.


>>> +/// parse scan to find the mime headers block for current message
>>> +bool findMimeBlock(const char *which, size_t limit);
> 
>> Find is a usually a const method. This method does a lot more than
>> just finding MIME headers. To fix that and to supply information
>> that the method name does not already reveal, I would do:
> 
>> /// returns whether the header was successfully parsed
>> bool parseMimeHeader(const char *headerKind, const size_t sizeLimit);
> 
> 
> I've adjusted the documentation to say what state changes happen on true.
> 
> 
> I do object to the use of "header" other than as a modifier to the
> named mime block type like I had it.

Please feel free to insert missing MIME augmentation as needed, of course.


> Would you accept "identify" instead of "find" ?
>  or maybe "grab" is better?

Out of those two, "grab" or "isolate" is better IMO. "Parse" would be
even better because parsing is exactly what this method is doing
(despite all the disclaimers to the contrary :-).


>> but there is more that needs to be done in the above code:
> 
>>> +// Squid could handle these headers, but admin does not want to
>>> +if (messageHeaderSize() >= limit) {
>>> +debugs(33, 5, "Too large " << which);
>>> +parseStatusCode = Http::scHeaderTooLarge;
>>> +return false;
>>> +}
> 
>> This should be rewritten to go before we initialize
>> mimeHeaderBlock_ IMO. Otherwise, the damage of keeping the
>> admin-prohibited header is already done.
> 
> 
> Done.
> 
> NP: This will prevent the ERR_TOO_BIG and cache.log level 11,2
> displaying the mime block as part of the message when it's present but
> outside the desired size range.

I am sure there is a way to have ERR_TOO_BIG and cache.log level 11,2
messages on huge messages if we want that. Parser polishing you are
working on should not prevent that.


>>> * This has now had several hours of regular web browsing by
>>> myself (YouTube, Facebook, some news sites, and the Squid sites)
>>> with nothing noticeably going wrong.
> 
>> Any ICAP tests? The HTTP-ICAP interaction has been a source of many
>> bugs before and your patch changes the relevant code. If you have
>> not tested it, please do (or ask others to do it).

> No the ICAP related code is not touched. It still uses the HttpMsg
> parser functions and its own original buffering. These changes only
> touch the server socket (virgin reply) buffer I/O and parse function.

The patch contains/changes adaptation-related code. I am not talking
about parsing ICAP messages, but about coordination between HTTP code
and ICAP/eCAP code. Anything inside USE_ADAPTATION.


>>> * Polygraph shows a 7 RPS performance loss (0.36%), with trunk
>>> having +/-2 RPS (0.1%) normal fluctuation. I believe this is from
>>> the transitional regression to data-copy into MemBuf for chunked
>>> coding.

>> Did Polygraph actually use chunked responses in these tests?

> Unsure, but I think so.

I suspect it did not. If I read the code correctly, Polygraph supports
dechunking of received responses but does not chunk the messages it
sends (because Polygraph knows their size). We will add that support!


Cheers,

Alex.


Re: [squid-dev] [PATCH] HTTP Response Parser upgrade

2015-01-23 Thread Alex Rousskov
On 01/20/2015 06:58 AM, Amos Jeffries wrote:

> Updated patch attached resolving the issues found by coadvisor in the
> original.

> +// HTTP Response status-line parse
> +
> +// magic contains major version, still need to find minor
> +SBuf verMinor;
> +// NP: we limit to 2-digits for speed, there really is no limit
> +// XXX: the protocols we accept dont have valid versions > 10 anyway
> +if (!tok.prefix(verMinor, CharacterSet::DIGIT, 2))
> +return -1; // invalid version minor code
> +if (tok.atEnd())
> +return 0; // need more to be sure we have it all
> +if(!tok.skip(' '))
> +return -1; // invalid version, a single SP terminator required
> +
> +debugs(74, 6, "found string version-minor=" << verMinor);
> +
> +// get the actual numeric value of the 0-3 digits we found
> +::Parser::Tokenizer t2(verMinor);
> +int64_t tvm = 0;
> +if (!t2.int64(tvm))
> +return -1; // ouch. digits not forming a valid number?

Parsing problems similar to those discussed earlier.


> +int retcode = parseResponseFirstLine();

Make const if possible.


Thank you,

Alex.


Re: [squid-dev] [PATCH] Initial libsecurity API

2015-01-26 Thread Alex Rousskov
On 01/14/2015 08:50 AM, Amos Jeffries wrote:
> This is the first step(s) towards a generic TLS/SSL security API for
> Squid.


> +// BUG: ssl_client.sslContext will leak on reconfigure when Config gets 
> memset()
...
> +Config.ssl_client.sslContext = 
> Security::ProxyOutgoingConfig.createContext();

Which memset(Config) call are you referring to here?

> void
> configFreeMemory(void)
> {   
> free_all();
> #if USE_OPENSSL
> SSL_CTX_free(Config.ssl_client.sslContext);
> #endif
> }

And is not Config.ssl_client.sslContext destroyed in the old
configFreeMemory() function quoted above?


> +// it makes more sense to create a context per outbound connection 
> instead of this

Please remove this comment. Since each context may consume gobs of
RAM, I doubt what you are suggesting always makes more sense, but
discussing this is outside your project scope.


> +NAME: tls_outgoing_options

Please do not forget the recently added SSL_OP_NO_TICKET when merging.


> +} // namespace Security
> +
> +// parse the tls_outgoing_options directive
> +inline void
> +parse_securePeerOptions(Security::PeerOptions *opt)
> +{
> +while(const char *token = ConfigParser::NextToken()) {
> +opt->parse(token);
> +}
> +}
> +
> +#define free_securePeerOptions(x) Security::ProxyOutgoingConfig.clear()
> +#define dump_securePeerOptions(e,n,x) // not supported yet


Please add an XXX to explain why these are declared outside their
namespace. For example:

XXX: These declarations are outside their namespace because our
generated parsing code cannot handle namespaces.


> +// parse the tls_outgoing_options directive
> +inline void
> +parse_securePeerOptions(Security::PeerOptions *opt)
> +{
> +while(const char *token = ConfigParser::NextToken()) {
> +opt->parse(token);
> +}
> +}

I see no reason to inline this loop. The related code is slow for other
reasons and not in a critical path. Please do not inline unless really
necessary.


> +#define free_securePeerOptions(x) Security::ProxyOutgoingConfig.clear()
> +#define dump_securePeerOptions(e,n,x) // not supported yet

Why are these #defined? If they can be implemented as regular functions,
they should be IMO.


> +// const loss is okay here, ssl_ex_index_server is only read and 
> not assigned a destructor
> +SSL_set_ex_data(ssl, ssl_ex_index_server, const_cast<SBuf *>(&peer->secure.sslDomain));

Perhaps I am missing something, but it looks like you are adding an SBuf
pointer where the rest of the code expects a char* pointer (both
SSL_set_ex_data and SSL_get_ex_data calls with the same
ssl_ex_index_server index). How did this change work or was this path
not tested?

If this is indeed a bug, please go through all ssl_ex_index_server users
to make sure they all agree on the data type. For example:

  bzr grep -n ssl_ex_index_server


> +#if WHEN_PEER_HOST_IS_SBUF
>  else
>  SSL_set_ex_data(ssl, ssl_ex_index_server, peer->host);
> +#endif

Loss of functionality here? Please restore if possible or
explain/document if not.


> +
> +class PeerOptions
> +{

Please add a one-line description to clarify what we mean by "Peer"
here: cache_peer, any server, any client or server, etc.


> +/// generate a security context from the configured options
> +Security::ContextPointer createContext();

Do we need to use the explicit Security:: prefix inside namespace
Security {}?


> +/// generate a security context from the configured options
> +Security::ContextPointer createContext();

If the description is accurate, should not this method be moved outside
the PeerOptions class? And if the move is not possible for some reason,
should not this method be "const"?


> +bool ssl;   ///< whether SSL is to be used on this connection

"this connection" is not clear in a PeerOptions context. Perhaps we can
say "with this peer"?

I thought you were opposed to using ssl for new stuff. Why not call this
"secured" or something like that? You have to change all calling code
anyway, right?


> +bool ssl;   ///< whether SSL is to be used on this connection
> +
> +SBuf certFile;   ///< path of file containing PEM format X509 
> certificate
> +SBuf privateKeyFile; ///< path of file containing private key in PEM 
> format
> +SBuf sslOptions; ///< library-specific options string
> +SBuf caFile; ///< path of file containing trusted Certificate 
> Authority
> +SBuf caDir;  ///< path of directory containing a set of trusted 
> Certificate Authorities
> +SBuf crlFile;///< path of file containing Certificate Revoke List
> +
> +int sslVersion;
> +SBuf sslCipher;
> +SBuf sslFlags;
> +SBuf sslDomain;

Please consider making "sslVersion" and "ssl" last data members (in that
order), to possibly save a tiny bit of padding and avoid giving a bad
order example for other classes (where it may actually matter).


If this change was not tested with and

[squid-dev] [RFC] Secure ICAP

2015-02-03 Thread Alex Rousskov
Hello,

We would like to add support for "Secure ICAP" -- ICAP services that
require SSL/TLS encryption for transport connections. Some other popular
proxies support that feature already; I have seen a trickle of admin
requests for it, and the feature often becomes essential when filtering
bumped traffic using external ICAP services (to preserve the overall
security of the entire message delivery chain).

Today, it is possible to emulate Secure ICAP using connection wrappers
like stunnel. Needless to say, that workaround is not a
production-quality solution.
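
For the record, a hedged sketch of such a wrapper setup (the stunnel
service name and addresses are examples): Squid speaks plain ICAP to a
local port while stunnel encrypts the traffic to the real server:

  [icap-wrapper]
  client = yes
  accept = 127.0.0.1:1344
  connect = icap.example.com:11344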


I think Secure ICAP can use configuration knobs and server validation
logic already implemented for secure HTTP peers (the cache_peer
directive with an ssl option).

To mark an ICAP service as secure in squid.conf, we can use an "icaps"
service URI scheme. The industry uses the "Secure ICAP" term for this
feature, but "icaps" seems more appropriate/standard for a scheme name
compared to "sicap".


We will not support dynamic "upgrades" from plain to secure ICAP
connections because:

* there are no ICAP servers that support that (AFAIK);
* there are no ICAP clients that support that (AFAIK);
* such support does not seem to be needed in practice given a rather
tight/long-term bonding between a proxy and an ICAP service (unlike a
relationship between an HTTP client and an origin server);
* such support can be added later, if needed, without redoing much of
the proposed work.

I also do not anticipate changes to the existing ICAP service
_selection_ configuration and related features, implementation of the
ICAP protocol itself, and eCAP.


If there are any objections to adding Secure ICAP support or high-level
suggestions regarding implementation, please let me know!


Thank you,

Alex.


Re: [squid-dev] [RFC] Secure ICAP

2015-02-12 Thread Alex Rousskov
On 02/03/2015 07:02 PM, Amos Jeffries wrote:

> My plan for the Crypto-NG / libsecurity work already in audit was to
> followup with a Security::Encryptor AsyncJob that could be passed the
> Comm::Connection object from a newly opened connection plus the
> Security::PeerOptions from squid.conf and initiate TLS/SSL on the
> connection.

We already have PeerConnector -- an async job that encapsulates most of
the security-related logic when it comes to connecting to other servers.
It is used by TunnelStateData, FwdState, and PeerPoolMgr. I doubt you
need another job that does the same.


> That model is reusable for cache_peer, DIRECT connections, ICAP, and any
> other protocols which use the X-over-TLS protocol layering. Without any
> major changes to the existing code - we just update the appropriate Comm
> callback to kick off the new Security::Encryptor job once the server FD
> is opened.
> 
> I would like to push that update forward rather than anyone writing
> another protocol-specific connection setup pathway. A good chunk of the
> preparation work is already done, so the time to completion would also
> be faster than a new project.


Sure, we would gladly reuse anything available as long as we do not have
to wait for it. AFAICT, if we use PeerConnector for Secure ICAP, and
your NG changes go inside PeerConnector, then we will not be stepping on
each other's toes too much.


Thank you,

Alex.





Re: [squid-dev] Moved PID file management from Coordinator to Master

2015-03-06 Thread Alex Rousskov
On 01/21/2015 05:03 AM, Amos Jeffries wrote:
> On 22/01/2015 12:57 a.m., Tsantilas Christos wrote:
> 
>> I am posting a new patch.
> 
>> This patch includes fixes to follow the Squid Coding Style but also has
>> a fix for a small bug: in the previous patches I posted, the PID file
>> was created by the master process but was not removed by the master
>> process. This patch fixes the master process to remove the PID file
>> on exit.
> 
> 
> +1.


Amos,

This patch went into trunk (r13867 and r13868). Do you plan on
merging it (together with the pending "start workers as root" follow up)
into the v3.5 branch? It is your call, of course, but it is essentially a
bug fix (SMP Squid does not restart reliably on busy servers without
these changes). We can provide a v3.5-specific [combined] patch if that
will make a positive decision easier for you.


Thank you,

Alex.


Re: [squid-dev] [RFC] Squid 4.0 ideas

2015-03-12 Thread Alex Rousskov
On 03/10/2015 04:58 AM, Amos Jeffries wrote:
> On 10/03/2015 5:41 a.m., Alex Rousskov wrote:
>> On 03/07/2015 10:04 PM, Amos Jeffries wrote:
>>
>>> Proposal 2)
>>>
>>> We are developing Squid with an incremental development process. The
>>> initial major version number is effectively meaningless in that process.
>>> We should move from the major.minor.patch to just a release.patch
>>> numbering system.
>>>
>>> This would mean this year's upcoming major series would be 4.x, and next
>>> year's would be 5.x and so on.
>>
>> My understanding of what has been proposed as release.patch is
>> different: There is no artificial year boundary. If you add something
>> new, you increment the release (i.e., first) number. If you just fix
>> something old, you increment the patch (i.e., second) number.


> It's already clear from past discussions that our views of "feature" are
> very different. If we bumped the version with each "new thing"  ...

I did not say "each". Squid releases happen when you make them happen,
and I am not discussing release timing. I am only talking about how to
number the releases. The release.patch algorithm is simple: If you
decide to make a release containing at least one new feature, increment
the release counter. Otherwise, just increment the patch level.


>> The advantage is in removing a concept of a "major release" and the
>> often artificial boundaries associated with it. For example, you would
>> not have to search for an excuse to release 4.0 like you are doing now.
> 
> I'm not searching for excuses. There is a meta-pattern of previous major
> numbers changing roughly when the language is changed.

There is no such pattern IMO. There is just one historical case that
fits the alleged pattern (v2->v3). Before v3 (v0, v1, and v2), Squid was
written in C. Since v3, Squid is written in C++. You may, of course, try
to establish such a pattern for future major releases.

AFAIK, nobody has strong objections either way -- version numbers are
not very important. I am just trying to document how the simpler
(release.patch) numbering may work and what its advantages/disadvantages
are.


HTH,

Alex.



Re: [squid-dev] [RFC] Squid 4.0 ideas

2015-03-12 Thread Alex Rousskov
On 03/07/2015 10:04 PM, Amos Jeffries wrote:

> Proposal 2)
> 
> We are developing Squid with an incremental development process. The
> initial major version number is effectively meaningless in that process.
> We should move from the major.minor.patch to just a release.patch
> numbering system.
> 
> This would mean this year's upcoming major series would be 4.x, and next
> year's would be 5.x and so on.

My understanding of what has been proposed as release.patch is
different: There is no artificial year boundary. If you add something
new, you increment the release (i.e., first) number. If you just fix
something old, you increment the patch (i.e., second) number.

The advantage is in removing a concept of a "major release" and the
often artificial boundaries associated with it. For example, you would
not have to search for an excuse to release 4.0 like you are doing now.

The disadvantage is in removing a concept of "major upgrade pains" and
the often real boundaries associated with it. For example, it would be
more difficult for you to explain that jumping from 11.4 to 12.4
requires a week of system administration and testing work while going
from 11.4 to 21.4 does not.

You can still have LTS-like releases, of course. The versioning scheme
does not affect that at all.


> There are quite a lot of infrastructure changes involved with that big a
> change, and its not entirely clear how the beta releases would be
> represented - perhaps not having any beta releases at all?

As Kinkie has said, you can apply the current solution for beta
releases by adding an extra number after zero. For the release.patch
scheme, betas would look like release.0.patch.

I remain skeptical about the advertised correlation between our
initial stable releases and actual, real-world stability. Our official
beta process seems to mislead as much as it informs. Thus, I welcome
experiments that would get rid of that process.

At the end of the day, versioning does not really matter much IMO. The
important things are stability and timeliness (i.e., release delay for
features in trunk) of our releases.


HTH,

Alex.



[squid-dev] [PREVIEW] Fixed reporting of NULL header characters

2016-08-04 Thread Alex Rousskov
Hello,

Our Level-1 "WARNING: HTTP header contains NULL characters" messages
were broken as was the level-7 reporting of the headers being parsed.
Level-2 reporting of "HTTP Server RESPONSE" headers (and probably raw
data in other contexts) was and still is broken. Such lying debugging
output could waste quite a bit of time during triage.

Please see the patch preamble for details. I believe similar problems
affect both v3.5 and trunk although I have been working only with trunk
during triage. Sample debugging logs are quoted below.

I am posting this patch as a PREVIEW to warn folks about these bugs and
to ask for volunteers to complete the two remaining TODOs detailed in
the preamble. Would anybody be willing to be responsible for this work?


Thank you,

Alex.

-- unpatched v3.5 output --

> 2016/08/04 23:16:48.738| 55,7| HttpHeader.cc(597) parse: parsing hdr: 
> (0x23ea928)
> Before-Null: foo
> 
> 2016/08/04 23:16:48.738| WARNING: HTTP header contains NULL characters 
> {Before-Null: foo
> }
> NULL
> {Before-Null: foo
> 
> 2016/08/04 23:16:48.738| 55,7| HttpHeader.cc(480) clean: cleaning hdr: 
> 0x23ea928 owner: 3


-- patched trunk output ---

> 2016/08/04 23:13:40.879| 55,7| HttpHeader.cc(329) parse: 0x175f668 got 
> buf[171]={
> Before-Null: foo
> .After-Null: bar
> Server: DaftServer/1.0
> Connection: close
> Date: Fri, 05 Aug 2016 05:13:40 GMT
> X-Daft-Response-ID: midcc64dd2c
> Content-Length: 12
> 
> }
> 2016/08/04 23:13:40.879| WARNING: HTTP header contains a NULL character; 
> before[18]={
> Before-Null: foo
> } and after[152]={
> After-Null: bar
> Server: DaftServer/1.0
> Connection: close
> Date: Fri, 05 Aug 2016 05:13:40 GMT
> X-Daft-Response-ID: midcc64dd2c
> Content-Length: 12
> 
> }
> 2016/08/04 23:13:40.879| 55,7| HttpHeader.cc(184) clean: cleaning hdr: 
> 0x175f668 owner: 3
Fixed reporting of NULL header characters. Improved HTTP header debugging.

Level-1 "WARNING: HTTP header contains NULL characters" messages were broken:
1. They were printing half of the headers twice and not printing the other half.
2. There could be just one NULL character, not multiple characters.

Level-7 reporting of the headers being parsed was broken:
3. Squid did not report header characters starting with NULL.

Level-2 reporting of "HTTP Server RESPONSE" headers (and probably raw data in
other contexts) is still broken for the same reason as #3 above.

Problem #1 was caused by incorrect getStringPrefix() usage. That function cannot
be called twice in the same debugs() message. We should be using Raw API
instead. This patch replaces several getStringPrefix() calls to illustrate how
this could be done (and test the output).

Problem #2 is just poor wording of the WARNING message. Now fixed.

Problem #3 was caused by getStringPrefix() inability to print buffers with NULL
characters. Using Raw is the right solution for that. Old Raw code could print
NULL characters, but our debugs() cannot handle them. To combat that, the new
Raw::defaultprint() method now replaces each NULL character with a '.', like xxd
does. This replacement would be necessary until somebody rewrites our
_db_print() API or its callers to handle NULL chars correctly (but the current
Raw code can be optimized even without that _db_print() rewrite).
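
A minimal sketch of that masking idea (the actual Raw printing code is
more involved):

  // print each byte, substituting '.' for NUL, as xxd does
  for (size_t i = 0; i < size_; ++i)
      os << (data_[i] ? data_[i] : '.');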


TODO: If we want to keep the 512-byte header reporting limit provided by
getStringPrefix(), then we should add Raw::maxVolume() to trigger a similar
debugging size limit and add Raw::mime() that will call both maxVolume() and
scoped(). Alternatively, we can limit _all_ Raw output to 512 chars. Either way,
that getStringPrefix() limit should be preserved IMO.

TODO: Replace all remaining getStringPrefix(...) calls with Raw(...).scoped()
or Raw(...).mime() and adjust the affected debugs() lines accordingly.

=== modified file 'src/Debug.h'
--- src/Debug.h	2016-05-23 17:03:20 +
+++ src/Debug.h	2016-08-05 02:57:36 +
@@ -153,58 +153,65 @@ HERE(std::ostream& s)
 /* some uint8_t do not like streaming control-chars (values 0-31, 127+) */
 inline std::ostream& operator <<(std::ostream &os, const uint8_t d)
 {
 return (os << (int)d);
 }
 
 /* Legacy debug function definitions */
 void _db_init(const char *logfile, const char *options);
 void _db_print(const char *,...) PRINTF_FORMAT_ARG1;
 void _db_set_syslog(const char *facility);
 void _db_rotate_log(void);
 
 /// Prints raw and/or non-terminated data safely, efficiently, and beautifully.
 /// Allows raw data debugging in debugs() statements with low debugging levels
 /// by printing only if higher section debugging levels are configured:
 ///   debugs(11, DBG_IMPORTANT, "always printed" << Raw(may be printed...));
 class Raw
 {
 public:
 Raw(const char *label, const char *data, const size_t size):
-level(-1), label_(label), data_(data), size_(size), useHex_(false) {}
+level(-1), label_(label), data_(data), size_(size), useHex_(false),
+formScope_(false) {}
 
 /// limit data printing to at

Re: [squid-dev] [PATCH] GnuTLS session redo

2016-08-05 Thread Alex Rousskov
On 08/03/2016 11:57 PM, Amos Jeffries wrote:
> +Security::SetSessionResumeData(const Security::SessionPointer &s, const 
> Security::SessionStatePointer &data)
> +{
> +if (data) {
> +#if USE_OPENSSL
> +(void)SSL_set_session(s.get(), data.get());
> +#elif USE_GNUTLS
> +(void)gnutls_session_set_data(s.get(), data->data, data->size);
> +#endif
> +}
> +}

Why cast to (void) here? The current code does not do that AFAICT.
Casting the result to (void) usually means

* I know this may fail, but I do not care.

It should be accompanied by a comment that explains why we do not care,
unless the code makes it obvious (the patched code does not).


Not casting may mean many things, including:

* It returns void.
* I did not even think about it.
* I know it may fail, but I either do not know how to handle its failure
or just do not have the time to handle it.

AFAICT, in this particular case, the current code probably matches one
of the last two bullets. If I am right, then there are two correct ways
to deal with it IMO:

1. Return a boolean success/failure result.
   The caller will still ignore the result.

2. Remove "(void)". Ideally, add a "// TODO: throw on failures" comment.

My weak recommendation is for #2.
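
For concreteness, a minimal sketch of option #2, keeping the patched
function shape:

    if (data) {
#if USE_OPENSSL
        // TODO: throw on failures
        SSL_set_session(s.get(), data.get());
#elif USE_GNUTLS
        // TODO: throw on failures
        gnutls_session_set_data(s.get(), data->data, data->size);
#endif
    }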


> +Security::SessionIsResumed(const Security::SessionPointer &s)
> +{
> +return
> +#if USE_OPENSSL
> +SSL_session_reused(s.get()) == 1;
> +#elif USE_GNUTLS
> +gnutls_session_is_resumed(s.get()) != 0;
> +#else
> +false;
> +#endif
> +}
> +
> +void
> +Security::GetSessionResumeData(const Security::SessionPointer &s, 
> Security::SessionStatePointer &data)
> +{
> +if (!SessionIsResumed(s)) {
> +#if USE_OPENSSL
> +data.reset(SSL_get1_session(s.get()));
> +#elif USE_GNUTLS
> +gnutls_datum_t *tmp = nullptr;
> +(void)gnutls_session_get_data2(s.get(), tmp);
> +data.reset(tmp);
> +#endif
> +}
> +}

Can we establish some rules for when to use "#else" in
OpenSSL/GnuTLS/Security contexts? The inconsistency illustrated above is
a red flag. I do not think "use #else if it is needed to compile" is the
rule used for the above code because Security::GetSessionResumeData()
may produce an "unused parameter 'data'" warning, killing the build.


> +/// Retrieve the data needed to resume this session on a later connection
> +void GetSessionResumeData(const Security::SessionPointer &, 
> Security::SessionStatePointer &);

> +void
> +Security::GetSessionResumeData(const Security::SessionPointer &s, 
> Security::SessionStatePointer &data)
> +{
> +if (!SessionIsResumed(s)) {

The implementation does not seem to match the description. The
implementation matches the current code, but the description says
nothing about !SessionIsResumed() precondition. Please either change
description and rename OR test the precondition in the callers,
whichever you think is better long-term.
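
For example, the first option might end up looking like this sketch (the
"Maybe" name is purely illustrative):

    /// Retrieves the data needed to resume this session on a later
    /// connection, but only if this session was not itself resumed.
    void MaybeGetSessionResumeData(const Security::SessionPointer &, Security::SessionStatePointer &);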


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] GnuTLS session redo

2016-08-05 Thread Alex Rousskov
On 08/05/2016 02:13 PM, Amos Jeffries wrote:
> On 6/08/2016 6:37 a.m., Alex Rousskov wrote:
>> On 08/03/2016 11:57 PM, Amos Jeffries wrote:
>>> +Security::SetSessionResumeData(const Security::SessionPointer &s, const 
>>> Security::SessionStatePointer &data)
>>> +{
>>> +if (data) {
>>> +#if USE_OPENSSL
>>> +(void)SSL_set_session(s.get(), data.get());
>>> +#elif USE_GNUTLS
>>> +(void)gnutls_session_set_data(s.get(), data->data, data->size);
>>> +#endif
>>> +}
>>> +}


>> Why cast to (void) here? The current code does not do that AFAICT.
>> Casting the result to (void) usually means
>>
>> * I know this may fail, but I do not care.

>> It should be accompanied by a comment that explains why we do not care,
>> unless the code makes it obvious (the patched code does not).


> Failures there just mean a clean new session is used.

I do not think SSL_set_session() fails when "a clean new session is
used" but perhaps I am misinterpreting what you mean by that.

If you meant that an SSL_set_session() failure is not important because
a new session will be eventually created instead of the old one being
reused, then I agree regarding what will happen but disagree regarding
that failure unimportance. Session reuse is very important in some
environments, so the proposed "I do not care why it did not happen" is
the wrong approach. We do care, at least for triage/debugging purposes.

That kind of logic would be very similar to saying "I do not care that
nothing gets cached due to a failed disk -- the users will get their
responses from the origin server instead". While they will indeed, we
still do care as far as stats/logging/debugging are concerned.


> I dont see any reason to throw. Resume is just a speed optimization for TLS.

We should throw if the callers need to know about the failure. We should
not throw if the callers do not care about the failure. Resume being an
optional optimization is irrelevant in this particular decision --
either the callers have some cleanup/setup/debugging code to accompany
this call or they do not.


>>> +Security::SessionIsResumed(const Security::SessionPointer &s)
>>> +{
>>> +return
>>> +#if USE_OPENSSL
>>> +SSL_session_reused(s.get()) == 1;
>>> +#elif USE_GNUTLS
>>> +gnutls_session_is_resumed(s.get()) != 0;
>>> +#else
>>> +false;
>>> +#endif
>>> +}
>>> +
>>> +void
>>> +Security::GetSessionResumeData(const Security::SessionPointer &s, 
>>> Security::SessionStatePointer &data)
>>> +{
>>> +if (!SessionIsResumed(s)) {
>>> +#if USE_OPENSSL
>>> +data.reset(SSL_get1_session(s.get()));
>>> +#elif USE_GNUTLS
>>> +gnutls_datum_t *tmp = nullptr;
>>> +(void)gnutls_session_get_data2(s.get(), tmp);
>>> +data.reset(tmp);
>>> +#endif
>>> +}
>>> +}
>>
>> Can we establish some rules for when to use "#else" in
>> OpenSSL/GnuTLS/Security contexts?

> Sure. So far I've been trying to only add #else when there was some
> meaningful code that needed to be done in those builds.

Not sure I understand the rule you are proposing (if you are proposing a
rule). Can you detail/rephrase it? Please keep the following two
questions in mind when doing so:

1. What are these rules trying to guarantee or accomplish?
   In other words, what will happen if we [do not] follow them?

2. If I see no "#else" (or no "#elif"), what does that mean?
   * We have not checked yet. It may not run or even compile.
   * We know it compiles (and runs?) without #else (or #elif).
   * Something should go there but we have not figured out what yet.
   * We know that nothing should go there; all callers will be fine.
   Etc.


> Usually it is for a default return statement or setting up error results
> indicating the related TLS/SSL feature not working. 

Should the "default return statement" always indicate an error of some
sort? If not, how to know when to implement the default success route
(your first reason) and when to set up an error (your second reason)?


> Code that is only
> run when the feature being used usually does not need #else.

Without a good set of rules, this gets messy very quickly IMO. For
example, your own GetSessionResumeData() does not reset the data
parameter in the [absent] #else case but does unconditionally reset it
(possibly to a nil value) in other cases. Is this a bug waiting to
happen, an intended side effect, or both? Without rules, it is
difficult to say for sure.


> I've built this w

Re: [squid-dev] [PATCH] GnuTLS session redo

2016-08-05 Thread Alex Rousskov
On 08/04/2016 03:40 AM, Amos Jeffries wrote:
> On 4/08/2016 5:57 p.m., Amos Jeffries wrote:

> I'm also wondering if it would be useful to add debugs() in the Get/Set
> functions for debugging session resume usage.

If we do not have that already, I think it would be very useful to add
debugging that shows whether the session was reused (and if not reused,
why). Whether that debugging should go into those Get/Set functions, I
have not checked. If those functions are called once per connection,
then they may be good candidates.


> Would it be worth the performance hit?

Is there a performance hit associated with adding a few debugs()
statements that will be checked once or twice per SSL connection!?


>  if so, at what debug level?

Just a suggestion:

3 for not reusing for some non-trivial reason
3 for failing to prepare future reuse for some non-trivial reason
5 for reusing (once per connection?)
5 for successfully preparing future reuse
7 for all the trivial cases and everything else
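
A hypothetical shape for such statements (the section/level numbers and the
conn name are illustrative, not actual Squid code):

    if (SessionIsResumed(session))
        debugs(83, 5, "resumed TLS session for " << conn);
    else
        debugs(83, 3, "no TLS session to resume for " << conn);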


HTH,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Broken trunk rev14778

2016-08-08 Thread Alex Rousskov
On 08/08/2016 04:43 AM, Amos Jeffries wrote:

>   r14778: Move static member Last into change() method to avoid 
> initialization order
>   errors when a caller uses a global InstanceId object before the library
>   instantiating its template is initialized.

Have you seen these Last errors? Constant initialization like "static
int Last = 0" happens before any code runs. If you saw these errors,
there must be something special about templates in this context that I
do not know about (and would like to learn).

  http://en.cppreference.com/w/cpp/language/constant_initialization
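
For reference, constant initialization of a class template static would be
spelled roughly like this (a sketch; the exact member type is whatever
InstanceId.h declares, shown as int here):

    template <class Class>
    int InstanceId<Class>::Last = 0; // constant-initialized before any code runs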


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Broken trunk rev14778

2016-08-08 Thread Alex Rousskov
On 08/08/2016 01:19 PM, Amos Jeffries wrote:
> On 9/08/2016 6:01 a.m., Alex Rousskov wrote:
>> On 08/08/2016 04:43 AM, Amos Jeffries wrote:
>>
>>>   r14778: Move static member Last into change() method to avoid 
>>> initialization order
>>>   errors when a caller uses a global InstanceId object before the library
>>>   instantiating its template is initialized.

>> Have you seen these Last errors?


> I don't think the pre-initialization of simple types is the issue here.
> Or maybe Last being an int and thus using that pre-initialization
> instead of having a contructor symbol is the root of the compilers
> confusion.

> It has been occuring on the d,debian-clang farm node for the last few
> weeks. Relevant part of the log output below.

> ../../../../src/base/InstanceId.h:28:27: error: instantiation of
> variable 'InstanceId::Last' required here, but no definition is
> available [-Werror,-Wundefined-var-template]
> InstanceId(): value(++Last ? Last : ++Last) {}


> The suggested declaration turns out not to be very portable, so I went
> with moving the global symbol into a method.

My interpretation of what you are saying is: "I have not seen any
initialization order errors that r14778 commit message talks about. I
have seen compiler errors. The true reasons behind those compiler errors
are not well understood, but moving Last into a method avoids errors in
my tests".

Is that interpretation accurate?


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [RFC] eCAP auto-enable

2016-08-08 Thread Alex Rousskov
On 08/08/2016 08:06 AM, Amos Jeffries wrote:
> IMO, eCAP has pretty much stabilized.

We will need a new major eCAP version/release to accommodate
backwards-incompatible C++11 builds as discussed at

  http://bugs.squid-cache.org/show_bug.cgi?id=4376#c19
  https://bugs.launchpad.net/ecap/+bug/1595488


> Any objections to auto-enabling it whenever available?

I fear the above problem needs to be solved first, but my response is
not an objection -- this is not my area of expertise...

IIRC, the primary reason why eCAP was not enabled by default (where
available) was _not_ its stability but the fact that it is trivial to
inject arbitrary code into Squid (via squid.conf) when loadable modules
are enabled. IIRC, folks thought that there is too much risk in enabling
that squid.conf vulnerability/attack vector by default.


> That would also mean adding it as a builddependency on our farm nodes.
> So we can explicitly fail build if eCAP code is overlooked in changes
> like rev.14778.

That should have been done a long time ago IMHO. Testing eCAP builds
should not depend on whether it is enabled by default.


Cheers,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Ipc::MemMap::ttl removal

2016-08-08 Thread Alex Rousskov
On 08/08/2016 08:12 AM, Amos Jeffries wrote:
> Coverity Scan latest checks are reporting that the ttl member of
> Ipc::MemMap is being left uninitialized.
> 
> It sounds like something which would lead to major bugs. Except that it
> turns out, AFAICS, that this ttl member is never actually being used.
> 
> Anyone have a reason not to simply drop it from the code?

There cannot be valid reasons to keep unused code for more than a year.
Please remove it.

FWIW, I bet it is just accidental leftovers from some earlier MemMap design.

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Make Squid death due to overloaded helpers optional

2016-08-09 Thread Alex Rousskov
On 08/09/2016 05:38 AM, Eduard Bagdasaryan wrote:
> On 08/08/2016 02:17 PM, Amos Jeffries wrote:
>> * helper::SubmissionFailure is also changing what was previously
> Helper::Unknown result codes to Helper::Error.
>  - Helper::Error is one of the helper output codes, it means success.
> Obviously when hlp == NULL, there is no success going on.
>  - please revert to sending helper::Unknown code.

> Reverted. Note that for helper::Unknown some callers print a bit
> misleading error message, when requests are dropped, e.g.:
> "ERROR: storeID helper returned invalid result code. Wrong helper?"

and the documentation now lies:

> + Two [on-persistent-overload] actions are supported:
> +
> +   die   Squid worker quits. This is the default behavior.
> +
> +   err   Squid treats the helper request as if it was
> +         immediately submitted, and the helper immediately
> +         responded with an error. This action has no effect on
> +         the already queued and in-progress helper requests.

Perhaps SubmissionFailure() should use Helper::Unknown when its hlp
parameter is nil and Helper::Error in other cases?


FYI: The original intent and the documentation say that we should return
Helper::Error. Yes, Helper::Error means a "successfully parsed" helper
response [with helper-specific semantics], and IIRC, that is what we
wanted to relay to the helper-using code because that code has a better
chance of doing the right thing when receiving Helper::Error (compared
to handling Helper::Unknown).

We also considered adding a new result code for overload but rejected
that idea for several reasons, including the expectation that the
helper-using code already has logic to handle Helper::Error and that
logic will work for on-persistent-overload=err cases well.

Please note that I am not advocating a _change_ in the current Squid
default behavior. I am only questioning whether it is a good idea to
return Helper::Unknown instead of Helper::Error with
on-persistent-overload=err. Amos' arguments do not apply to that
specific case AFAICT because there is no "change" in that specific case
(it is a new behavior).


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Make Squid death due to overloaded helpers optional

2016-08-09 Thread Alex Rousskov
On 08/09/2016 11:39 AM, Amos Jeffries wrote:
> On 10/08/2016 4:41 a.m., Alex Rousskov wrote:
>> On 08/09/2016 05:38 AM, Eduard Bagdasaryan wrote:
>>> On 08/08/2016 02:17 PM, Amos Jeffries wrote:
>>>> * helper::SubmissionFailure is also changing what was previously
>>> Helper::Unknown result codes to Helper::Error.
>>>  - Helper::Error is one of the helper output codes, it means success.
>>> Obviousy when hlp == NULL, there is no success going on.
>>>  - please revert to sending helper::Unknown code.
>>
>>> Reverted. Note that for helper::Unknown some callers print a bit
>>> misleading error message, when requests are dropped, e.g.:
>>> "ERROR: storeID helper returned invalid result code. Wrong helper?"
>>
>> and the documentation now lies:
>>
>>> +   Two [on-persistent-overload] actions are supported:
>>> +
>>> + die   Squid worker quits. This is the default behavior.
>>> +
>>> + err   Squid treats the helper request as if it was
>>> +       immediately submitted, and the helper immediately
>>> +       responded with an error. This action has no effect on
>>> +       the already queued and in-progress helper requests.
>>
>> Perhaps SubmissionFailure() should use Helper::Unknown when its hlp
>> parameter is nil and Helper::Error in other cases?
>>
>>
>> FYI: The original intent and the documentation say that we should return
>> Helper::Error. Yes, Helper::Error means a "successfully parsed" helper
>> response [with helper-specific semantics], and IIRC, that is what we
>> wanted to relay to the helper-using code because that code has a better
>> chance of doing the right thing when receiving Helper::Error (compared
>> to handling Helper::Unknown).
>>
>> We also considered adding a new result code for overload but rejected
>> that idea for several reasons, including the expectation that the
>> helper-using code already has logic to handle Helper::Error and that
>> logic will work for on-persistent-overload=err cases well.
>>
>> Please note that I am not advocating a _change_ in the current Squid
>> default behavior. I am only questioning whether it is a good idea to
>> return Helper::Unknown instead of Helper::Error with
>> on-persistent-overload=err. Amos' arguments do not apply to that
>> specific case AFAICT because there is no "change" in that specific case
>> (it is a new behavior).
>>
> 
> At the moment Helper::Unknown means a problem in Squid side of things
> (fail). BH means a problem internally to the helper (fail). And both
> OK/ERR mean helper decision was made (success).
>  The naming may not great, and the way they map down to boolean
> allow/deny type values is very painful, but thats what we have to work
> with currently.

Yep, that matches both my understanding and motivation to return ERR in
the explicitly configured on-persistent-overload=err case.


> The decision of whether Unknown maps to ERR is something each different
> helper interface needs to decide for itself based on what its purpose is.

The decision what fake response object to give the helper code [when
helper itself is overloaded] should be up to the admin. The admin wants
"as if the helper returned ERR". We can (and perhaps should) add more
on-persistent-overload values later, of course.
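
For context, the knob under discussion is a helper-children option; a
hypothetical squid.conf line (all values illustrative):

    # fake an ERR reply on persistent overload instead of killing the worker
    url_rewrite_children 20 startup=5 idle=1 queue-size=40 on-persistent-overload=err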

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Make Squid death due to overloaded helpers optional

2016-08-10 Thread Alex Rousskov
On 08/09/2016 06:19 PM, Henrik Nordström wrote:
> Tue 2016-08-09 at 11:47 -0600, Alex Rousskov wrote:
>>
>> Yep, that matches both my understanding and motivation to return ERR
>> in the explicitly configured on-persistent-overload=err case.

> I'd say make it configurable.
> 
> on-persistent-overload=abort/err/ok/unknown
> 
> to represent all three possible states.

Agreed (as a TODO for somebody who has a corresponding use case). We are
laying the framework for that future enhancement, but should not be
forced to implement it, especially because returning OK responses will
probably require more configuration and perhaps even helper-specific code.


> But that is based on the assumption that unknown is "outcome is not
> known" and that Squid then skips the access rule and looks at next
> rule. Have not looked in detail in how current Squid operated with
> these conditions.

The uncertainty about "unknown" is one more reason to leave enabling
that as a TODO for those with a use case.


> Starting with only abort/err is good. err should then map to the same
> as a negative helper response which is the most clear buldingblock to
> work with then bulding access lists.
> 
> It does not make sense that err in this context should map to different
> logics depending on the directive.

Agreed, and the take #10 implementation posted earlier appears to
implement the "err" case correctly AFAICT.

As Amos has noted, we do need to restore the old "unknown" behavior when
the helper is _missing_ (and not overloaded), but that is a completely
different problem with a simple solution: SubmissionFailure() should use
Helper::Unknown when its hlp parameter is nil and Helper::Error otherwise.


Cheers,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Make Squid death due to overloaded helpers optional

2016-08-11 Thread Alex Rousskov
On 08/11/2016 06:53 AM, Amos Jeffries wrote:
> On 11/08/2016 11:50 p.m., Eduard Bagdasaryan wrote:
>> 2016-08-10 19:03 GMT+03:00 Alex Rousskov:
>>
>>> As Amos has noted, we do need to restore the old "unknown" behavior when
>>> the helper is _missing_ (and not overloaded), but that is a completely
>>> different problem with a simple solution: SubmissionFailure() should use
>>> Helper::Unknown when its hlp parameter is nil and Helper::Error.
>>
>> Adjusted accordingly and updated the patch.


> Since we seems to have consensus that the 'err' value means ERR, and
> other codes will be added later.
> 
> The config value should likewise be the result code, uppercase "ERR",
> and the wording "responded with an error" be "responded with ERR". That
> makes it crystal clear that whatever the helper ERR handling is will happen.

I agree. Please use "and the helper immediately replied with an ERR
response" wording.


> If there are no further objections, I will apply it tomorrow (~12hrs)
> with the above change.

No objections from me. While committing:


> +callback(data, hlp ? Helper::Reply(Helper::Error) : Helper::Reply(Helper::Unknown));

If possible, move the ternary operator inside Helper::Reply(...) to
avoid "Helper::Reply" code duplication:

  Helper::Reply(hlp ? Helper::Error : Helper::Unknown)


>   This options sets default queue-size option of the url_rewrite_children
>   to 0.
> +
>  DOC_END

Please undo this non-change.


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Incorrect processing of long URIs

2016-08-22 Thread Alex Rousskov
On 08/22/2016 04:24 PM, Eduard Bagdasaryan wrote:

> -// Limit to 32 characters to prevent overly long sequences of non-HTTP
> -// being sucked in before mismatch is detected. 32 is itself annoyingly
> -// big but there are methods registered by IANA that reach 17 bytes:
> -//  http://www.iana.org/assignments/http-methods
> -static const size_t maxMethodLength = 32; // TODO: make this 
> configurable?
>  
>  SBuf methodFound;
> -if (!tok.prefix(methodFound, CharacterSet::TCHAR, maxMethodLength)) {
> +if (!tok.prefix(methodFound, CharacterSet::TCHAR, MaxMethodLength)) {
>  debugs(33, ErrorLevel(), "invalid request-line: missing or malformed 
> method");


I do not understand why you decided to use maxMethodLength in
parseRequestFirstLine(). AFAICT, parseMethodField() already does
everything we need: It logs an error message and sets parseStatusCode
accordingly. I would expect method-specific code to remain in
parseMethodField() where they belong and parseRequestFirstLine() to be
similar to this:

if (cannot find LF) {
if (cannot buffer any more characters) {
/* who should we blame for our failure to parse this line? */

Http1::Tokenizer prefixTok(buf_);
if (!parseMethodField(prefixTok))
return -1; // blame a bad method

if (!skipDelimiter(prefixTok.skipAll(delimiters)))
return -1; // blame a bad method/URI delimiter

// assume it is the URI
debugs(...);
parseStatusCode = Http::scUriTooLong;
return -1;
}

debugs(74, 5, "Parser needs more data");
return 0;
}

And, after noticing that we always skip the delimiter after the method,
I would probably move that functionality into parseMethodField(), to
arrive at something like this:

if (cannot find LF) {
if (cannot buffer any more characters) {
/* who should we blame for our failure to parse this line? */

Http1::Tokenizer methodTok(buf_);
if (!parseMethodField(methodTok, delimiters))
return -1; // blame a bad method (or its delimiter)

// assume it is the URI
debugs(...);
parseStatusCode = Http::scUriTooLong;
return -1;
}

debugs(74, 5, "Parser needs more data");
return 0;
}



> +assert(parseStatusCode != Http::scNone);

Please no assert()s in new parsing code unless absolutely necessary. If
this code stays for some reason, please use Must() instead.
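
That is, something like:

    Must(parseStatusCode != Http::scNone); // throws instead of aborting the process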


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Incorrect processing of long URIs

2016-08-23 Thread Alex Rousskov
On 08/23/2016 03:26 AM, Eduard Bagdasaryan wrote:
> 2016-08-23 3:08 GMT+03:00 Alex Rousskov:
>> I do not understand why you decided to use maxMethodLength in
>> parseRequestFirstLine(). AFAICT, parseMethodField() already does
>> everything we need: It logs an error message and sets parseStatusCode
>> accordingly.

> Yes, it does partly what we need but it does not parse the delimiter
> (and inform about possible failure).

... which is unrelated to the maxMethodLength move I questioned above.


> skipDelimiter() does this with message:
> 
>> invalid request-line: missing delimiter
> 
> which does not hint that our case is a 'bad' method 

If the method parses OK but the delimiter is missing, then our case can
be considered to be a case of a "missing delimiter" rather than of a
"bad method". It is all relative, of course: If a request line starts
with a few token characters, then it is impossible to say whether the
method suffix is invalid or the following delimiter -- it takes two to
tango.


> and less informative than:
> 
>> invalid request-line: method exceeds 32-byte limit

How is a "missing delimiter" error less informative than "method
exceeds 32-byte limit", especially when the method does not actually
exceed that limit? [Rhetorical]


> I followed your sketch and refreshed the patch.

> +debugs(74, ErrorLevel(), "invalid request-line exceeds " <<
> +Config.maxRequestHeaderSize << "-byte limit");

bzr grep "invalid request-line" src

s/request-line/request-line: URI/ for consistency and clarity sake.


> +const CharacterSet &delimiters = DelimiterCharacters();

I wonder whether we should make this variable static to avoid repeated
function calls on a performance-sensitive code path. Same for the old
"delimiters" variable left inside parseRequestFirstLine(), of course.


These polishing touches can be done during commit.

+1

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Incorrect processing of long URIs

2016-08-23 Thread Alex Rousskov
On 08/23/2016 08:08 AM, Amos Jeffries wrote:
> A followup patch can be done to give skipDelimiter a 'const char* which'
> parameter that takes a description/name for the delimiter to improve
> that debug output.
> 
> so:
>   skipDelimiter(blah, "method")
> 
> produces:
>  invalid request-line: missing method delimiter

Good idea! I would

   s/which/where/
   s/missing <where> delimiter/missing delimiter <where>/

because the URI field has two associated delimiters (left and right),
and a "where" parameter would be able to detail the missing delimiter
better:

  invalid request-line: missing delimiter after method
  invalid request-line: missing delimiter before "HTTP/1"

Same for the "too many delimiters" error, of course.


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] Coding standards

2016-08-23 Thread Alex Rousskov
On 08/23/2016 05:48 AM, Adam Majer wrote:

> What are the coding standards for Squid?

Just to add to Kinkie's correct response: We do not have a comprehensive
standard, unfortunately, but you can find a few requirements at

  http://wiki.squid-cache.org/SquidCodingGuidelines

(which should be updated to answer the C++11 question).


HTH,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Revalidate without Last-Modified

2016-08-23 Thread Alex Rousskov
On 08/23/2016 09:17 AM, Amos Jeffries wrote:
> On 24/08/2016 12:07 a.m., Eduard Bagdasaryan wrote:
>> 2016-08-21 15:58 GMT+03:00 Amos Jeffries:
>>> To change anything between those markers we have to do a full cache
>>> versioning and up/down-grade compatibility dance.

>> Could you please clarify what versioning problems you are talking
>> about?

> I was referring to that block of members being read and written
> directly to cache files. Either swap.state or the object metadata section.
> I dont think they are referenced individully, but as a raw-pointer to
> the first members address cast to some other identically laid out
> StoreMeta* type (yucky).


Yes, this happens in at least two places:

> bzr grep timestamp src | fgrep \&
...
> src/store_rebuild.cc:memcpy(&what->timestamp, x.value, STORE_HDR_METASIZE);
> src/store_swapmeta.cc:t = StoreMeta::Factory(STORE_META_STD_LFS,STORE_HDR_METASIZE,&e->timestamp)


> The point of those comments AFAIK is to make it obvious that this kind of
> thing is/was being done with them and prevent us doing what you did.

Agreed. Since improving this code is way outside this simple fix scope,
I recommend returning the renamed lastmod data member into that "keep
them together" section, with a comment:

time_t lastmod_; ///< received Last-Modified value or -1; treat as private

Please note that "received Last-Modified value or -1" is just my guess
what lastmod_ is -- use the right description if my guess is wrong.

Renaming lastmod (and temporary making it private) already forced you to
check and, if needed, adjust most of the old code using that data
member. Keeping lastmod_ public is not ideal, but it does not make
things worse and allows you to stay focused. Fixing related problems is
only a good idea when those fixes are simple and do not stray you off
course too much.


HTH,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Incorrect processing of long URIs

2016-08-24 Thread Alex Rousskov
On 08/24/2016 06:36 AM, Eduard Bagdasaryan wrote:
> 2016-08-23 18:01 GMT+03:00 Alex Rousskov:
> 
>> invalid request-line: missing delimiter before "HTTP/1"
> 
> In order to generate "where" with such detalization (i.e. the specific
> protocol version or method) 

The proposed detailing does not specify the exact protocol version or
method. It is just a constant c-string like "before HTTP/1" and "after
method".

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Incorrect processing of long URIs

2016-08-24 Thread Alex Rousskov
On 08/24/2016 08:30 AM, Amos Jeffries wrote:
> On 25/08/2016 12:36 a.m., Eduard Bagdasaryan wrote:
>> 2016-08-23 18:01 GMT+03:00 Alex Rousskov:
>>
>>> invalid request-line: missing delimiter before "HTTP/1"
>>
>> In order to generate "where" with such detalization (i.e. the specific
>> protocol version or method) we would need to pass skipDelimiter() the
>> parsed AnyP::ProtocolVersion or HttpRequestMethod objects, which would
>> require converting skipDelimiter() to a template:
>>
>> skipDelimiter(size_t, const char *where, const C &what);
> 
> Two options there;
> 
> Option 1, is to use a fixed and slightly generic where message:
>"before protocol version"
> 
>   Pros: static debugs message, only used when actually needed.
> 
>   Cons: its generic rather than protocol specific.
> 
> 
> Option 2, is to use a PackableStream to construct that part of debug
> message in advance.
> 
>   Pros: its accurate and detailed
> 
>   Cons: extra memory and time required to generate the string when its
> probably never used.


> I think go with option 1 since this is a performance critical piece of code.

I agree but for a completely different reason:

* Effective error triage needs a lot more details than a specific method
name or protocol version -- a different problem we are not trying to
solve here.

* I do not consider error handling to be a performance critical piece of
code (and yes, we have already discussed whether error handling is
performance-critical, and we disagree on that).


Cheers,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Older response must not update

2016-08-24 Thread Alex Rousskov
On 08/24/2016 09:20 AM, Amos Jeffries wrote:

> in src/HttpReply.h:
> * please use doxygen syntax "\returns" instead of "returns" in the
> comment text.

No objection, but please note that the "returns..." phrase in isolation
does not fully describe what the method returns in this case. Only the
method description taken as a whole describes that.


> in src/client_side_reply.cc:
> * please remove the "handleIMSReply: " bit from touched debugs messages.

If Eduard does that, he should remove that bit from all related debugs
messages to keep them all consistent IMO.


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Revalidate without Last-Modified

2016-08-25 Thread Alex Rousskov
On 08/25/2016 04:04 AM, Eduard Bagdasaryan wrote:

> Therefore, we could use the timestamp if Last-Modified is unavailable.

I do not understand why the patch hides the lastmod field behind a basic
getter. If we assert that a timestamp-based last modification value
should be used in many cases, then we need to force the calling code to
make an important choice between a raw lastmod and a timestamp-based
last modification value.

The patch does not force the caller to make that choice because
developers will naturally call lastModified() getter instead of
effectiveModificationTime(). If calling lastModified() produces more
problems like the one you are trying to fix (with reviewers not being
able to guess that the wrong method was called), then renaming lastmod_
accomplished nothing. Similarly, if calling lastModified() produces no
new problems (because the lastmod usage case you were fixing happened to
be exceptional rather than the norm), then renaming lastmod_ also
accomplished nothing.

The patch does use effectiveModificationTime() in several places already
so I suspect its usage is not that exceptional -- many, possibly even
most contexts should use effectiveModificationTime(). If you agree, then
we should make using lastmod_ getter more difficult -- the developer and
the reviewer should be forced to pay attention to such use because it is
likely to indicate a bug in new code (i.e., the new code should have
called effectiveModificationTime() instead).
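
One way to force that attention is to give the raw accessor a deliberately
conspicuous name; a sketch only (the name is illustrative):

    /// the raw, possibly negative value; most callers should use
    /// effectiveModificationTime() instead
    time_t rawLastModified() const { return lastmod_; }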

Moreover, these raw lastmod abuse problems might be already surfacing in
your own patch! I see several raw lastmod value uses there:


1. Misleading debugging:

> +const time_t lastmod = http->storeEntry()->effectiveModificationTime();
> +http->request->lastmod = lastmod;
...
> -debugs(88, 5, "clientReplyContext::processExpired : lastmod " << entry->lastmod );
> +debugs(88, 5, "lastmod " << entry->lastModified());

In other words, Squid is going to use the effective modification time,
but we are logging raw time to make triage harder.


2. Writing -1 to cache store metadata:

>  currentEntry(sd->addDiskRestore(key,
>  tmpe.timestamp,
...
> -tmpe.lastmod,
> +tmpe.lastModified(),

and

>  StoreSwapLogData s;
>  s.timestamp = e.timestamp;
...
> -s.lastmod = e.lastmod;
> +s.lastmod = e.lastModified();

and

> -basics.lastmod = from.lastmod;
> +basics.lastmod = from.lastModified();

Does a store need raw lastmod or effective modification time? I suspect
it is the latter, but perhaps I am missing some valid use cases for the
former. This needs a careful consideration.

If we are lucky, there is a relatively simple way to answer this
question: If the stored lastmod value is only used to re-initialize some
future StoreEntry::lastmod_ value _and_ all non-debugging uses of
StoreEntry::lastmod_ are going to be via effectiveModificationTime(),
then we can store effective modification time instead of -1!


3. Sending an HTCP message to another service.

> -hdr.putTime(Http::HdrType::LAST_MODIFIED, e->lastmod);
> +if (e && e->lastModified() > -1)
> +hdr.putTime(Http::HdrType::LAST_MODIFIED, e->lastModified());

Is this a conditional/revalidation request? If yes, then should we use
an effective modification time instead, like we do in use case #1?


4. StoreEntry state reporting.

>  describeTimestamps(const StoreEntry * entry)
...
>   (int) entry->timestamp,
...
> - (int) entry->lastmod,
> + (int) entry->lastModified(),

The debugging should reflect the true value so this is fine. However, we
can make this function a StoreEntry method to avoid the need for getter
_if_ this is the only place where the getter is needed.


If I did not miss any use cases, then #3 appears to be critical here. If
we should be using effective modification time in #3, then there appears
to be no need to store -1 in StoreEntry and your patch can be greatly
simplified. If #3 does need a raw value from StoreEntry, then we will
need a better way to force developers to think about the right method to
use.

Please double check what is going on in #3, and we will go from there.


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Older response must not update

2016-08-25 Thread Alex Rousskov
On 08/25/2016 08:18 AM, Eduard Bagdasaryan wrote:
> 2016-08-24 18:20 GMT+03:00 Amos Jeffries:
> 
>> in src/LogTags.cc:
>> * instead of adding new enum entry please extend LogTags with a new bool
>> flag and the c_str() to append the "IGNORED" when that flag is true.

> Added a new LogTags::ignored for this.

The naming of things in logging code is exceptionally poor so please do
not feel upset that you got this wrong. The "_IGNORED" flag Amos is
talking about should go into LogTags::Errors.

I recommend renaming and re-documenting that subclass:

/// Things that may happen to a transaction while it is being
/// processed according to its LOG_* category. Logged as _SUFFIX(es).
/// Unlike LOG_* categories, these flags may not be mutually exclusive.
class Flags
{
...
   bool ignored; ///< _IGNORED: the response was not used for anything
   bool timedout; ///< _TIMEDOUT: terminated due to a lifetime or I/O timeout
   bool aborted; ///< _ABORTED: other abnormal termination (e.g., I/O error)
} flags;

This does not fix all the problems, but reduces the confusion
considerably IMHO.


> I wonder whether the
> LogTags assignment operator should copy this member too.

Which assignment operator?

* The compiler-generated assignment operator will copy all members
automatically, of course.

* The explicitly-defined LogTags_ot conversion assignment operator is
already buggy or dangerous. It should be removed. Whether it was meant
to clear the flags, I do not know, unfortunately. I recommend leaving it
as is because fixing it is out of scope and may not be trivial. Adding
an "XXX: this operator does not reset flags" and "TODO: either replace
with a category-only setter or remove" comments would be appropriate.
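
A sketch of those markers on the existing operator (signature approximated):

    /// XXX: this operator does not reset flags
    /// TODO: either replace with a category-only setter or remove
    LogTags &operator =(const LogTags_ot &aTag);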


HTH,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Incorrect processing of long URIs

2016-08-25 Thread Alex Rousskov
On 08/25/2016 10:26 AM, Amos Jeffries wrote:
> 2016-08-23 17:50 GMT+03:00 Alex Rousskov:
>> I wonder whether we should make this variable static to avoid repeated
>> function calls on a performance-sensitive code path.


> The output of DelimiterCharacters() cannot be stored in a static because
> it possibly changes with a reconfigure.

Agreed. My suggestion was wrong.


> Other than that it looks okay to me. +1.
>  Alex can you apply please with the above taken care of?

Committed to trunk (r14815).


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Incorrect processing of long URIs

2016-08-25 Thread Alex Rousskov
On 08/25/2016 10:26 AM, Amos Jeffries wrote:
> About
> the only further optimization we can do there is make the
> "CharacterSet::SP" that it outputs in the sensitive path be a local
> static *within* DelimiterCharacters() itself and return a reference to
> that instead of constructing a new CharaterSet each time.

Which new CharacterSet are we constructing each time in this context?
CharacterSet::SP is already a static and it is not copied during each
DelimiterCharacters() call, right? SP might not be initialized, but I do
not see an _optimization_ opportunity here.


Cheers,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


[squid-dev] Sad performance trend

2016-08-26 Thread Alex Rousskov
Hello,

Factory ran a bunch of micro-level Polygraph tests, targeting
several Squid subsystems: header parsing, connection management, and SSL
(but no SslBump). We tested a few Squid versions going back to v3.1:

          W1   W2   W3   W4   W5   W6
  v3.1   32%  38%  16%  48%  16%   9%
  v3.3   23%  31%  14%  42%  15%   8%
  v3.5   11%  16%  12%  36%   7%   6%
  v4.0   11%  15%   9%  30%  14%   5%

Each W column represents a particular Polygraph workload (the workload
details are not important right now). Each percentage cell represents a
single test result for a given Squid version. Absolute numbers mean
little in these quick-and-dirty micro tests, but higher percentages are
better (100% would be comparable to a "wire" -- an ideal proxy with zero
overhead). Comparing numbers from different columns is virtually
meaningless because different workloads have different 100% levels.


If you follow each W column from top (v3.1 stabilizing in ~2010) to
bottom (current v4.0/trunk), you will notice that all studied Squid
subsystems are getting slower and slower. These micro tests exaggerate
the impact of specific subsystems and cannot predict real performance
degradation in a given environment, but things are clearly getting worse
with time. This is a sad trend!


Whether you blame it on an unfortunate initial code state, insufficient
project resources, lack of an official architect, mediocre development
skills, nearly absent quality controls, something else, or the
combination of several factors, we are clearly doing something wrong,
year after year after year...

And what is arguably worse than the bad trend itself, we did not even
_know_ about the problem (or at least its magnitude) until now.


I will not propose any solutions at this time, but will feed this
information into ongoing Squid Foundation discussions about revamping
Squid QA. Your feedback is welcomed, of course.


Thank you,

Alex.
___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Revalidate without Last-Modified

2016-08-27 Thread Alex Rousskov
On 08/27/2016 05:22 AM, Eduard Bagdasaryan wrote:
> 2016-08-25 18:52 GMT+03:00 Alex Rousskov:
> 
>> 3. Sending an HTCP message to another service.
>>
>> > -hdr.putTime(Http::HdrType::LAST_MODIFIED, e->lastmod);
>> > +if (e && e->lastModified() > -1)
>> > +hdr.putTime(Http::HdrType::LAST_MODIFIED, e->lastModified());

>> Is this a conditional/revalidation request? If yes, then should we use
>> an effective modification time instead, like we do in use case #1?

> The code snippet is taken from htcpTstReply(), serving replies for TST
> requests (TST - test for the presence in the cache). If the entry is cached,
> this method fills the HTCP reply packet with some of the cached entry's
> headers (Age, Expires, Last-Modified etc.). According to the documentation,
> these headers are added in order to give the client an ability to calculate
> the object's freshness itself: the HTCP response does not provide such
> information explicitly. On the other hand, I have not found any strict
> requirements of what HTTP headers an HTCP reply should contain.
> 
> Would the "effective" Last-Modified information help the HTCP client to
> calculate the object's freshness? It looks like some more information in
> this case is better than lack of it.

Good question. I worry about the message recipient comparing received
effective LMT with the actual (absent!) LMT of the object they have in
the cache and then deciding that the resource has changed (and their
cached copy is now stale) because the resource now has LMT and it did
not have it before.

Not all HTCP clients are Squids, but how does Squid code treat such an
HTCP TST response? In other words, will Squid itself make a different
decision if it receives an effective LMT header value instead of no LMT
header field at all? Does your own patch affect the receiving Squid
behavior?


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Revalidate without Last-Modified

2016-08-27 Thread Alex Rousskov
On 08/27/2016 08:33 AM, Amos Jeffries wrote:

> If the response Squid would emit to the client proxy would contain a
> synthesized Last-Modified header - then the same synthetic value should
> be sent in HTCP.

I agree with that decision logic.


> I think Squid should be emitting a synthetic L-M header. So yes to
> effective modification time being sent (in both HTCP and HTTP responses).

I am worried that emitting effective/synthetic LMT would confuse some
clients that are currently working OK (possibly, but not necessarily,
including some Squids). After all, "unknown LMT" and "LMT of at most X"
are not exactly the same things, even if the protocol treats them the
same in many specific cases.

However, if Amos is sure that always emitting effective LMT [in cache
hits?] is a good idea, then I have no grounds for an objection.


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Older response must not update

2016-08-27 Thread Alex Rousskov
On 08/27/2016 08:23 AM, Amos Jeffries wrote:
> On 26/08/2016 5:05 a.m., Alex Rousskov wrote:
>> I recommend renaming and re-documenting that subclass:
>>
>> /// Things that may happen to a transaction while it is being
>> /// processed according to its LOG_* category. Logged as _SUFFIX(es).
>> /// Unlike LOG_* categories, these flags may not be mutually exclusive.
>> class Flags
>> {
>> ...
>>bool ignored; ///< _IGNORED: the response was not used for anything
>>bool timedout; ///< _TIMEDOUT: terminated due to a lifetime or I/O
>> timeout
>>bool aborted; ///< _ABORTED: other abnormal termination (e.g., I/O error)
>> } flags;
>>
>> This does not fix all the problems, but reduces the confusion
>> considerably IMHO.


> Well, yes and no. 'Flags' is too generic. The sub-struct is called
> 'Errs' because it stores the data for the fail 'errors' that occured
> during processing. 

I am not sure we want to restrict flags to errors, and it is often
difficult to say whether something is an error. For example, a response
with a smaller Date value is not really an error from the
sender/protocol point of view but can be seen as an error by the end
user that wasted all that time waiting for it. Squid does not need to
take sides. It just needs to tell the admin what has happened.


> We will eventually have several groups of flags in
> similar sub-struct/class - grouped as per the wiki meaning groupings and
> documented along the lines of what the wiki says already.
> http://wiki.squid-cache.org/SquidFaq/SquidLogs#Squid_result_codes

FWIW, I doubt that wiki page is ready to represent consensus regarding
future implementation -- I see a few problems with the way tags are
classified right now, and it is not even clear whether most of the
groups contain mutually exclusive values (like enums) or flags (like
bitmaps or a bunch of boolean fields). If all but one group contains
enums, then having that special Flags "bitmap" group would make a lot of
sense.


> I'm fine with a name change if we can preserve the fail/error/abnormal
> event meaning in it.

If you insist, the committer can rename Flags back to Errors. I doubt
Errors is better or that it is the right direction, but I am not going
to argue about it.


>> * The explicitly-defined LogTags_ot conversion assignment operator is
>> already buggy or dangerous. It should be removed. Whether it was meant
>> to clear the flags, I do not know, unfortunately. I recommend leaving it
>> as is because fixing it is out of scope and may not be trivial.

> It is a setter for (only) the old "mutually exclusive" LOG_* category
> detail - it still seems like a realatively (but not perfect) way to set
> the category. The other flags are not erased by category having changed
> since the event that caused them to be set has still previously occured
> in the transaction. The only reason we would have for clearing those
> would be a retry/reforward which is not something the LogTags itself is
> aware of. The caller changing the category is expected to be aware of
> the old category value and preserve anything that needs preserving
> (which is part of what is wrong with the old categories way).

Thank you for explaining this. Based on the above, the TODO comment in
the patch can be made more specific to just say "replace with a
category-only setter". A comitter can do that.

The new XXX comment remains correct as is: Using an assignment operator
to _partially_ reset an object is just plain wrong because it violates
the expected assignment semantics.


In summary, if you are going to commit the patch, it is your call
whether to rename Flags. If you want me to commit, please let me know
what name to use for that class. And the new TODO comment should be
adjusted to say "replace with a category-only setter".


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Revalidate without Last-Modified

2016-08-30 Thread Alex Rousskov
On 08/30/2016 04:35 AM, Eduard Bagdasaryan wrote:
> 2016-08-28 1:12 GMT+03:00 Alex Rousskov:
>> Not all HTCP clients are Squids, but how does Squid code treat such an
>> HTCP TST response?

> It seems that Squid does not care whether HTCP response contains LMT or
> not. Moreover, it looks that all TST response headers are only parsed
> inside htcpHandleTstResponse() without using them further for making
> any decisions

Wow. That finding, combined with Amos' assertion that
effectiveModificationTime() is always acceptable as the lastmod value,
strongly suggests that there is no need to distinguish the two.
How about this:


    void lastModified(const time_t when) { lastModified_ = when; }

    time_t lastModified() const
    {
        // may still return -1 if timestamp is not set
        return lastModified_ < 0 ? timestamp : lastModified_;
    }

and completely removing effectiveModificationTime()?


I also think that we should rename/hide lastmod because it reduces the
chances that somebody will access the [negative] data member directly:


> /* START OF ON-DISK STORE_META_STD TLV field */
> time_t timestamp;
> time_t lastref;
> time_t expires;
> private:
> time_t lastModified_; ///< received Last-Modified value or -1; use lastModified()
> public:
> uint64_t swap_file_sz;
> uint16_t refcount;
> uint16_t flags;
> /* END OF ON-DISK STORE_META_STD */


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


[squid-dev] [PATCH] Reject or sanitize more problematic Content-Length values

2016-09-01 Thread Alex Rousskov
Hello,

Squid is violating HTTP MUSTs by forwarding messages with
problematic Content-Length values. Some of those bugs were fixed in
trunk r14215. This change handles multiple Content-Length values inside
one header field, negative values, and trailing garbage. Handling the
former required a change in the overall Content-Length interpretation
approach (which is why it was previously left as a TODO).

Squid now passes almost all Co-Advisor tests devoted to this area. We
are not 100% done though: We still need to handle malformed values with
leading signs (e.g., "-0" or "+1"). However, I hope that the remaining
problems are relatively minor. I do not plan on addressing them in the
foreseeable future.

Also improved httpHeaderParseOffset(): Added detection of overflowing
and underflowing integer values; polished malformed value detection code
(Linux strtoll(3) manual page has a good example). The function no
longer considers empty strings valid and reports trailing characters.
The function still accepts leading whitespace and signs. It is still the
wrong approach to HTTP numbers parsing, but further improvements are out
of scope because they are complicated and would require significant
caller rewrites.
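
For reference, the strtoll(3)-style checks boil down to something like this
sketch, where start, value, and endPtr stand for the function parameters
(not the committed code; assumes <cstdlib> and <cerrno>):

    errno = 0;
    char *end = nullptr;
    const int64_t v = strtoll(start, &end, 10);
    if (errno == ERANGE)
        return false; // overflow or underflow
    if (end == start)
        return false; // no digits at all; covers the empty string case
    *value = v;
    *endPtr = end; // lets the caller detect and report trailing characters
    return true;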


HTH,

Alex.
Reject or sanitize more problematic Content-Length values.

Squid is violating HTTP MUSTs by forwarding messages with problematic
Content-Length values. Some of those bugs were fixed in trunk r14215.
This change handles multiple values inside one Content-Length header
field, negative values, and trailing garbage.

TODO: Handle malformed values with leading signs (e.g., "-0" or "+1").

Also improved httpHeaderParseOffset(): Added detection of overflowing
and underflowing integer values; polished malformed value detection
code. The function no longer considers empty strings valid and reports
trailing characters. The function still accepts leading whitespace and
signs. It is still the wrong approach to HTTP numbers parsing, but
further improvements are out of scope because they are complicated and
would require significant caller rewrites.

=== modified file 'src/HttpHeader.cc'
--- src/HttpHeader.cc	2016-08-18 12:43:27 +
+++ src/HttpHeader.cc	2016-09-01 23:10:20 +
@@ -1,34 +1,36 @@
 /*
  * Copyright (C) 1996-2016 The Squid Software Foundation and contributors
  *
  * Squid software is distributed under GPLv2+ license and includes
  * contributions from numerous individuals and organizations.
  * Please see the COPYING and CONTRIBUTORS files for details.
  */
 
 /* DEBUG: section 55HTTP Header */
 
 #include "squid.h"
+#include "base/CharacterSet.h"
 #include "base/EnumIterator.h"
 #include "base64.h"
 #include "globals.h"
+#include "http/one/Parser.h"
 #include "HttpHdrCc.h"
 #include "HttpHdrContRange.h"
 #include "HttpHdrScTarget.h" // also includes HttpHdrSc.h
 #include "HttpHeader.h"
 #include "HttpHeaderFieldInfo.h"
 #include "HttpHeaderStat.h"
 #include "HttpHeaderTools.h"
 #include "MemBuf.h"
 #include "mgr/Registration.h"
 #include "profiler/Profiler.h"
 #include "rfc1123.h"
 #include "SquidConfig.h"
 #include "StatHist.h"
 #include "Store.h"
 #include "StrList.h"
 #include "TimeOrTag.h"
 #include "util.h"
 
 #include 
 
@@ -38,40 +40,77 @@
 
 /*
  * On naming conventions:
  *
  * HTTP/1.1 defines message-header as
  *
  * message-header = field-name ":" [ field-value ] CRLF
  * field-name = token
  * field-value= *( field-content | LWS )
  *
  * HTTP/1.1 does not give a name name a group of all message-headers in a message.
  * Squid 1.1 seems to refer to that group _plus_ start-line as "headers".
  *
  * HttpHeader is an object that represents all message-headers in a message.
  * HttpHeader does not manage start-line.
  *
  * HttpHeader is implemented as a collection of header "entries".
  * An entry is a (field_id, field_name, field_value) triplet.
  */
 
+/// Finds the intended Content-Length value while parsing message-header fields.
+/// Deals with complications such as value lists and/or repeated fields.
+class ContentLengthInterpreter
+{
+public:
+    explicit ContentLengthInterpreter(const int aDebugLevel);
+
+    /// updates history based on the given message-header field
+    /// \return true iff the field should be added/remembered for future use
+    bool checkField(const String &field);
+
+    /// intended Content-Length value if sawGood is set and sawBad is not set
+    /// meaningless otherwise
+    int64_t value;
+
+    /* for debugging (declared here to minimize padding) */
+    const char *headerWideProblem; ///< worst header-wide problem found (or nil)
+    const int debugLevel; ///< debugging level for certain warnings
+
+    /// whether a malformed Content-Length value was present
+    bool sawBad;
+
+    /// whether all remembered fields should be removed
+    /// removed fields ought to be replaced with the intended value (if known)
+    /// irrelevant if sawBad is set
+    bool needsSanitizing;
+
+    /// whether a valid field value was present
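
To see how such an interpreter slots into field parsing, here is a rough
usage sketch (the loop, entry type, and helper names below are invented
for illustration; the real caller is the header parsing code, which this
message does not show):

ContentLengthInterpreter clen(debugLevel); // debugLevel: assumed available
for (const HttpHeaderEntry *e : parsedEntries) { // parsedEntries: hypothetical
    if (e->id == HDR_CONTENT_LENGTH && !clen.checkField(e->value))
        continue; // drop this Content-Length field; do not remember it
    remember(e); // hypothetical helper that keeps the entry
}
if (clen.sawBad)
    rejectMessage(); // hypothetical: malformed Content-Length => refuse
else if (clen.needsSanitizing)
    replaceContentLength(clen.value); // hypothetical: keep one intended value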

Re: [squid-dev] [PATCH] Reject or sanitize more problematic Content-Length values

2016-09-02 Thread Alex Rousskov
On 09/02/2016 09:05 AM, Amos Jeffries wrote:
> On 2/09/2016 11:21 a.m., Alex Rousskov wrote:
>> This change handles multiple Content-Length values inside
>> one header field, negative values, and trailing garbage. Handling the
>> former required a change in the overall Content-Length interpretation
>> approach (which is why it was previously left as a TODO).


> +1. Looks okay to me. With one caveat:
> 
> Since there are several of these interpreters for special-cased headers
> I think we need to define a sub-namespace for header
> interpreters/parsers, etc. 

I am not sure I understand or agree with your logic for deciding when to
add a namespace. A namespace is primarily a name clash avoidance
mechanism. What name clashes are you trying to avoid by adding a "header
interpreters/parsers" namespace?

As an illustration, consider the standard library that provides a lot of
things, including a lot of similar things (e.g., stream classes or math
functions). How many namespaces does std contain? I knew of only a
couple before I looked it up. There are actually about a dozen
"standard" ones, but all the ones I checked are concerned with name
clashes, not just grouping similar things together.


> Does keeping the existing "Hdr" term seem like a good idea to you?

Not sure what you mean. Are you suggesting to rename
ContentLengthInterpreter to ContentLengthHdrInterpreter? That looks like
a clash of naming styles to me.


> * If so please move ContentLengthInterpreter to a Http::Hdr:: namespace
> and ContentLengthInterpreter.{h,cc} files in a matching directory.
> 
> * Otherwise, please at least place the class within the Http:: namespace
> and directory etc.

I recommend leaving ContentLengthInterpreter class name as is. The class
interprets Content-Length headers to find their intended meaning during
parsing. It is not meant to be an Http::ContentLength class that
stores/manages/writes parsed Content-Length information.

If that is not going to create linking problems, I can move the
ContentLengthInterpreter class into http/one/ContentLengthInterpreter.*
files and place it inside the existing Http::One namespace. Would that
be sufficient to address your concerns?
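
For concreteness, the proposed placement would look roughly like this (a
sketch only; the class body stays as posted in the patch):

// http/one/ContentLengthInterpreter.h
namespace Http {
namespace One {

class ContentLengthInterpreter
{
    // ... as in the patch above ...
};

} // namespace One
} // namespace Http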


> (PS. I plan to bundle the next 4.x within the next 2-3 days. I leave it to
> you to decide whether this gets applied now or after release).

I would prefer to commit now, but that requires reaching an agreement
regarding ContentLengthInterpreter name and location as discussed above.


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Reject or sanitize more problematic Content-Length values

2016-09-02 Thread Alex Rousskov
On 09/02/2016 09:11 PM, Amos Jeffries wrote:

> I would really like it to be under Http:: and in http/; the rest is okay
> to skip.

Sounds good. I have no problems with moving that code into http/ and
Http::. It is certainly appropriate, especially if you expect HTTP/2
code to benefit from this class. I hope the linker will not have any
problems with that move either.

Unfortunately, I may not be able to work this long weekend, so I may not
be able to do this until Tuesday and may miss your packaging deadline.
Feel free to commit/move this code yourself, of course.


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] Sad performance trend

2016-09-06 Thread Alex Rousskov
On 09/06/2016 08:27 AM, Amos Jeffries wrote:
> On 27/08/2016 12:32 p.m., Alex Rousskov wrote:
>>         W1   W2   W3   W4   W5   W6
>>   v3.1  32%  38%  16%  48%  16+  9%
>>   v3.3  23%  31%  14%  42%  15%  8%
>>   v3.5  11%  16%  12%  36%   7%  6%
>>   v4.0  11%  15%   9%  30%  14%  5%

> That trend goes all the way back to at least 2.6, which is a bit weird,
> since it contradicts the total request-per-second capacity we have been
> watching in polygraph results.

I do not know what you mean by "total request-per-second capacity", but
I most likely have not been watching it. Is there a historical results
table for that measure that I can study?


> The speed per request goes down and the number that can be handled rises.

Even if concurrent transactions handling has been improving (and that is
a big if -- I have no data to back it up), it does not excuse the
significant worsening of per-transaction performance IMO.


> As far as my simple observations have been, the per-request slowdown
> correlates to the amount of code being added (ie new features) and also
> to the amount of AsyncCalls involved with the request handling (where
> more AsyncCall is more latency).

Additional code obviously slows things down, but the existing rate of
adding new features/async calls does not seem to match the magnitude of
the measured performance decline, especially for the basic code paths
tested by these micro tests. For things to be getting worse in such a
manner, we probably have to be making the old code/features worse (in
addition to adding new features/async calls) and/or adding those new
features in a terribly inefficient way.

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Reject or sanitize more problematic Content-Length values

2016-09-06 Thread Alex Rousskov
On 09/05/2016 09:31 PM, Amos Jeffries wrote:
> On 6/09/2016 8:52 a.m., Amos Jeffries wrote:
>> On 3/09/2016 5:48 p.m., Alex Rousskov wrote:
>>> On 09/02/2016 09:11 PM, Amos Jeffries wrote:
>>>
>>>> I would realy like it to be under Http:: and in http/ the rest is okay
>>>> to skip.
>>>
>>> Sounds good. I have no problems with moving that code into http/ and
>>> Http::. It is certainly appropriate, especially if you expect HTTP/2
>>> code to benefit from this class. I hope the linker will not have any
>>> problems with that move either.
>>>
>>> Unfortunately, I may not be able to work this long weekend, so I may not
>>> be able to do this until Tuesday and may miss your packaging deadline.
>>> Feel free to commit/move this code yourself, of course.
>>
>>
> 
> Applied as trunk r15819

Thank you,

Alex.


___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] New Defects reported by Coverity Scan for Squid after IndependentRunner

2016-09-09 Thread Alex Rousskov
On 09/09/2016 07:34 AM, Christos Tsantilas wrote:
> On 09/09/2016 02:21 PM, Amos Jeffries wrote:
>> These issues are caused by the new RegisterRunner() design using
>> GetRidOfRunner(rr) if shutdown has already begun. That can potentially
>> result in the constructor of a class inheriting from IndependentRunner
>> deleting 'this', then the new'd object being used.


> However, GetRidOfRunner() will never delete an IndependentRunner, so
> there is no real problem here.

To be more precise, there is a Coverity problem here, not a Squid
problem, right?


>> I think what we should be doing is using Must(RegisterRunner(this))
>> instead of just RegisterRunner(this) for children of IndependentRunner
>> so their constructors throw on errors.

That does not sound right: AFAICT, the code logic does not really
require successful registration of [independent] runners so it would be
inappropriate to throw when registration fails. In other words, a failed
independent runner registration does not indicate a bug in the code AFAICT.

IIRC, the current RR API does not guarantee emitting (and the code
should not rely on receiving) any RR events. If we want to make that
guarantee, we would probably have to supply a list of events (that a
specific RR must receive) to the registration function. I do not think
we need it, but please let me know if I missed any good use cases for
that functionality.


>> Also the IndependentRunner::registerRunner() method is not used
>> anywhere. Was it supposed to be called by the child classes ?
>>  (IdleConnList and ConnStateData)

> Well, the registerRunner() method should be used instead of
> RegisterRunner for ConnStateData and IdleConnList. But this change was
> somehow lost while I was playing with the patches.

[rant] Which means our testing was inadequate again. Accidents happen,
but such a major bug should not have gone unnoticed if proper testing
was in place.[/rant]


> However this is alone I think will not solve the problem.

Just to make sure we are on the same page: Which problem are you trying
to solve? A false USE_AFTER_FREE defect reported by Coverity?


> Maybe we need to re-implement  registerRunner() with something like:
> 
> IndependentRunner::registerRunner() {
>     RegisterRunnerIgnoreOnShutdown();
> }
> 
> The RegisterRunnerIgnoreOnShutdown() is similar to RegisterRunner but it
> will not call GetRidOfRunner().

To avoid code duplication, you will need something like this:

void
RegisterRunnerUnlessShutDown(RegisteredRunner *rr)
{
    if (FindRunners())
        RegisterRunner_(rr);
    // else do nothing: we are past finishShutdown
}

bool
RegisterOrGetRidOfRunner(RegisteredRunner *rr)
{
    if (FindRunners()) {
        RegisterRunner_(rr);
        return true;
    } else {
        GetRidOfRunner(rr);
        return false; // past finishShutdown
    }
}

where RegisterRunner_() is a static void function that never calls
GetRidOfRunner() and throws if it cannot FindRunners().

One advantage of this more verbose/explicit solution is that it
minimizes fears about using dynamic_cast<> on a being-constructed
object. I think the current code is actually OK there, but avoiding that
question is a good thing so I support this change.


Thank you,

Alex.




>> On 9/09/2016 5:45 a.m., scan-admin wrote:
>>>
>>> ** CID 1372673:  Memory - illegal accesses  (USE_AFTER_FREE)
>>> /src/servers/FtpServer.cc: 55 in Ftp::Server::Server(const
>>> RefCount<MasterXaction> &)()
>>>
>>>
>>> 
>>>
>>> *** CID 1372673:  Memory - illegal accesses  (USE_AFTER_FREE)
>>> /src/servers/FtpServer.cc: 55 in Ftp::Server::Server(const
>>> RefCount<MasterXaction> &)()
>>> 49 static bool CommandHasPathParameter(const SBuf &cmd);
>>> 50 };
>>> 51
>>> 52 Ftp::Server::Server(const MasterXaction::Pointer &xact):
>>> 53 AsyncJob("Ftp::Server"),
>>> 54 ConnStateData(xact),
>> CID 1372673:  Memory - illegal accesses  (USE_AFTER_FREE)
>> Dereferencing freed pointer "this".
>>> 55 master(new MasterState),
>>> 56 uri(),
>>> 57 host(),
>>> 58 gotEpsvAll(false),
>>> 59 onDataAcceptCall(),
>>> 60 dataListenConn(),
>>>
>>> ** CID 1372672:  Memory - illegal accesses  (USE_AFTER_FREE)
>>> /src/servers/Http1Server.cc: 27 in Http::One::Server::Server(const
>>> RefCount<MasterXaction> &, bool)()
>>>
>>>
>>> 
>>>
>>> *** CID 1372672:  Memory - illegal accesses  (USE_AFTER_FREE)
>>> /src/servers/Http1Server.cc: 27 in Http::One::Server::Server(const
>>> RefCount<MasterXaction> &, bool)()
>>> 21 #include "servers/Http1Server.h"
>>> 22 #include "SquidConfig.h"
>>> 23 #include "Store.h"
>>> 24
>>> 25 CBDATA_NAMESPACED_CLASS_INIT(Http1, Server);
>>> 26
>> CID 1372672:  Memory - illegal accesses  (USE_AFTER_FREE)
>> Dereferencing freed pointer "this".
>>> 27 Http::One::Server::Server(const MasterXaction::P

Re: [squid-dev] New Defects reported by Coverity Scan for Squid after IndependentRunner

2016-09-09 Thread Alex Rousskov
On 09/09/2016 11:21 AM, Christos Tsantilas wrote:
> On 09/09/2016 07:00 PM, Alex Rousskov wrote:
>> On 09/09/2016 07:34 AM, Christos Tsantilas wrote:
>>> On 09/09/2016 02:21 PM, Amos Jeffries wrote:
>>>> Also the IndependentRunner::registerRunner() method is not used
>>>> anywhere. Was it supposed to be called by the child classes ?
>>>>  (IdleConnList and ConnStateData)

>>> Well, the registerRunner() method should be used instead of
>>> RegisterRunner for ConnStateData and IDleConnList. But this change
>>> somewhere lost while I was playing with the patches.

>> [rant] Which means our testing was inadequate again. Accidents happen,
>> but such a major bug should not have went unnoticed if proper testing
>> was in place.[/rant]

> But it is not exactly a bug. The registerRunner() method just calls
> "RegisterRunner(this)"; it does not do anything else.
> We just did not replace RegisterRunner(this); with the
> IndependentRunner::registerRunner() call.

Ah, this is minor, and no functionality testing could expose this. I
incorrectly thought that we are not registering independent runners at all.

BTW, you can probably move the sketched RegisterRunnerUnlessShutDown()
function into IndependentRunner::registerRunner() itself and also add
a Must(not an independent runner) check to the sketched
RegisterOrGetRidOfRunner().
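
In other words, something like this (a sketch reusing the names from the
earlier message; not committed code):

void
IndependentRunner::registerRunner()
{
    // the former RegisterRunnerUnlessShutDown() logic, inlined:
    if (FindRunners())
        RegisterRunner_(this);
    // else do nothing: we are past finishShutdown
}

bool
RegisterOrGetRidOfRunner(RegisteredRunner *rr)
{
    Must(!dynamic_cast<IndependentRunner*>(rr)); // the suggested extra check
    if (FindRunners()) {
        RegisterRunner_(rr);
        return true;
    }
    GetRidOfRunner(rr);
    return false; // past finishShutdown
}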


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] New Defects reported by Coverity Scan for Squid after IndependentRunner

2016-09-10 Thread Alex Rousskov
On 09/10/2016 06:54 AM, Amos Jeffries wrote:
> On 10/09/2016 7:26 a.m., Alex Rousskov wrote:
>> On 09/09/2016 11:21 AM, Christos Tsantilas wrote:
>>> On 09/09/2016 07:00 PM, Alex Rousskov wrote:
>>>> On 09/09/2016 07:34 AM, Christos Tsantilas wrote:
>>>>> On 09/09/2016 02:21 PM, Amos Jeffries wrote:
>>>>>> Also the IndependentRunner::registerRunner() method is not used
>>>>>> anywhere.

>>>>> Well, the registerRunner() method should be used instead of
>>>>> RegisterRunner for ConnStateData and IdleConnList.

>>>> such a major bug should not have gone unnoticed

>>> But it is not exactly a bug.
>>> We just did not replace RegisterRunner(this); with the
>>> IndependentRunner::registerRunner() call.

>> Ah, this is minor, and no functionality testing could expose this. I
>> incorrectly thought that we are not registering independent runners at all.

> Well, it is rare-ish, but I think it can happen with ConnStateData if a
> connection is accepted after shutdown starts during the grace period
> waiting for client connections to finish up.

Yes, the registration may be "late", but the committed code supports
late registrations well AFAIK -- there is no functionality bug. There is
an unused method. I trust Christos will fix that (and improve
surrounding code while he is at it).

I misinterpreted Christos earlier response and thought that there is a
major bug that we missed. That major bug does not exist. The code works
OK but needs some polishing.


> What I was thinking was to have this:
> 
>  IndependentRunner::registerRunner() {
>      Must(RegisterRunner(this));
>  }

It is wrong to throw when things are working OK. A failed RR
registration is not a bug; it is only a late registration that can be
ignored.

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] Sad performance trend

2016-09-12 Thread Alex Rousskov
On 09/12/2016 07:25 AM, Kinkie wrote:
> > On 27/08/2016 12:32 p.m., Alex Rousskov wrote:
> >>         W1   W2   W3   W4   W5   W6
> >>   v3.1  32%  38%  16%  48%  16+  9%
> >>   v3.3  23%  31%  14%  42%  15%  8%
> >>   v3.5  11%  16%  12%  36%   7%  6%
> >>   v4.0  11%  15%   9%  30%  14%  5%

> would it be possible
> to share these benchmarks with the larger community, 

Not sure what you mean by sharing benchmarks but I can, of course, share
detailed test results and [trivial] Polygraph workloads that were used
for these tests. However, they have no significant value on their own,
and their unpolished specifics are likely to distract us from the core
problem. What value do you see in publishing those dirty details?


> possibly by leveraging the automations we are already using?

Sorry, I do not know what that means. AFAICT, the Project (i.e., "we")
is not using any automated performance testing at this time (or it is so
broken that it does not even detect the obvious trend above). I do not
have enough cycles to fix and maintain that myself, but I have sent the
Foundation Board a specific proposal on how to address that problem.

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] Sad performance trend

2016-09-12 Thread Alex Rousskov
On 09/12/2016 09:38 AM, Amos Jeffries wrote:
> On 7/09/2016 5:43 a.m., Alex Rousskov wrote:
>> On 09/06/2016 08:27 AM, Amos Jeffries wrote:
>>> On 27/08/2016 12:32 p.m., Alex Rousskov wrote:
>>>>         W1   W2   W3   W4   W5   W6
>>>>   v3.1  32%  38%  16%  48%  16+  9%
>>>>   v3.3  23%  31%  14%  42%  15%  8%
>>>>   v3.5  11%  16%  12%  36%   7%  6%
>>>>   v4.0  11%  15%   9%  30%  14%  5%

> Since the test was a bit unreliable I ran it freshly against each branch
> that would build when I wanted to check progress.
> The last test run can be found in parserng-polygraph if you want to dig
> into the logs for other measures.

Sorry, I do not know how to get to "parserng-polygraph". All Jenkins
results published at http://build.squid-cache.org/ appear to be too old,
but perhaps I am looking in the wrong place. It is probably not
important for this specific discussion though.


> branch: Mean RPS
> 
> squid-3.2 : 1933.98
> squid-3.3 : 1932.81
> squid-3.4 : 1931.12
> squid-3.5 : 1926.13

Looks like performance may be gradually getting worse in this macro test
as well, but the decrease is not as visible/pronounced as in micro tests
(which is understandable/expected, of course).


> The fluctuation / error bars seemed to be about 1 RPS for that polygraph
> workload.

Please correct me if I am misinterpreting what you are saying, but to me
it sounds like "The results are not getting gradually worse because each
result has a ~1 rps precision". That conclusion does not compute for me
for two independent reasons:

1. There is a significant difference between results fluctuation and a
trend. Individual test results fluctuation may co-exist with a
real/meaningful downward trend. Whether that happens here, I do not
know, but your comment appears to imply that you are dismissing the
visible trend for the wrong/inapplicable reason.

2. Even if 1 unit difference is insignificant, the results show that the
performance got worse by 7+ units (~1934 vs ~1926).

BTW, I would caution against thinking of these numbers as RPS. That test
is not designed to predict sustained request rates. A 1 unit difference
in these results may correspond to ~0% or, say, ~10% difference in
actual sustained request rates.

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] Squid-4 release checklist

2016-09-12 Thread Alex Rousskov
On 09/12/2016 09:54 AM, Amos Jeffries wrote:

> * 
>   Windows Update works via interception proxy on 3.5.17, and does not
> work via transparent proxy on Squid 4.x.
> 
>  - FWIW; others have been mentioning various issues with various Squid
> versions ever since Windows 10 came out.
>  - I am inclined to downgrade this unless someone can reproduce it.

Why "downgrade"? The importance of a bug should not go _down_ after
confirming (or assuming) that the bug affects several Squid versions.
You might ignore the old bug for the purpose of assigning various labels
to Squid branches, but downgrading it for that reason seems inappropriate.


> * 
>   CloudFlare and hosted sites cannot be connected via SSL\TLS(with DH)
> with NONE/503
> 
>  - stuck waiting sufficient information to identify what the problem
> actually is. The traces we have so far indicate the network in question
> is very noisy with TCP packet retries, duplications, checksum errors,
> and outright delivery failures.
>  - I am inclined to downgrade unless anyone else is able to reproduce
> the issue.

Downgrading because others cannot reproduce seems inappropriate.
Treating the bug as UNCONFIRMED would the right cause of action if
nobody but Yuri can confirm it. The current MOREINFO state can also be
used as an excuse to ignore this bug for the purpose of assigning
various labels to Squid branches *if* you think that there is actually
no bug there (and that getting more info will only confirm that).


> * 
>   Memory hits shadow disk entries and vice versa
> 
>  - will downgrade unless someone intends to work on it.

Downgrading bug severity because nobody is working on a fix is wrong.

Please note that Eduard is working on fixing that bug, and the bug
metadata (i.e., the Assignee field) reflects that fact (to the extent
our Bugzilla can reflect it). We are on the tenth patch revision, but
are probably approaching a usable state.


HTH,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Incorrect logging of request size

2016-09-12 Thread Alex Rousskov
On 09/01/2016 03:44 PM, Eduard Bagdasaryan wrote:
> Hello,
> 
> This patch fixes logged request size (%http::>st) and other size-related
> %codes.
> 
> The %[http::]>st logformat code should log the actual number of
> [dechunked] bytes received from the client. However, for requests with
> Content-Length, Squid was logging the sum of the request header size and
> the Content-Length header field value instead. The logged value was
> wrong when the client sent fewer than Content-Length body bytes.
> 
> Also:
> 
> * Fixed %http::>h and %http::<h in ICAP logging context. The old code
>   was focusing on preserving the "request header" (i.e. not "response
>   header") meaning of the %http::>h logformat code, but since ICAP
>   services deal with (and log) both HTTP requests and responses, that
>   focus on the HTTP message kind actually complicates ICAP logging
>   configuration and will eventually require introduction of new %http
>   logformat codes that would be meaningless in access.log context.
>   - In ICAP context, %http::>h should log to-be-adapted HTTP message
> headers embedded in an ICAP request (HTTP request headers in REQMOD;
> HTTP response headers in RESPMOD). Before this change, %http::>h
> logged constant/"FYI" HTTP request headers in RESPMOD.
> 
>   - Similarly, %http:: embedded in an ICAP response (HTTP request headers in regular
> REQMOD; HTTP response headers in RESPMOD and during request
> satisfaction in REQMOD). Before this change, %http:: REQMOD.
> 
>   In other words, in ICAP logging context, the ">" and "<" characters
>   should indicate ICAP message kind (where the being-logged HTTP message
>   is embedded), not HTTP message kind, even for %http codes.
> 
>   TODO: %http::>h code logs "-" in RESPMOD because AccessLogEntry does
>   not store virgin HTTP response headers.
> 
> * Correctly initialize ICAP ALE HTTP fields related to HTTP message
>   sizes for icap_log, including %http::>st, %http::sh, and
>   %http::>sh format codes.
> 
> * Initialize HttpMsg::hdr_sz in HTTP header parsing code. Among other
>   uses, this field is needed to initialize HTTP fields inside ICAP ALE.
> 
> * Synced default icap_log format documentation with the current code.


I cannot track the effects of all the proposed low-level fixes, of
course, but no changes jumped at me as wrong. IIRC, Eduard tried quite a
few test cases in hope to weed out the bugs. The underlying code was in
poor shape, on several levels. This patch does not solve all the
problems, but at least Squid should stop lying about several sizes that
folks are often monitoring (and some icap.log codes should be
interpreted in a more consistent/orderly fashion).

I have not seen any public reviews for this patch. If there are no
last-minute objections, I will commit these important fixes to trunk soon.


Cheers,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Incorrect logging of request size

2016-09-13 Thread Alex Rousskov
On 09/12/2016 10:06 PM, Amos Jeffries wrote:
> Just the new cf.data.pre docs for icap_log contradicting itself:
> 
> "
>   http::>h...
>   HTTP response headers in RESPMOD) ...
>   currently does not support logging of HTTP response headers in
> RESPMOD ...
> "
> 
> I think that should probably be saying it does not support HTTP
> *request* headers in RESPMOD.

I assume you are talking about this blob:

> http::>h 
> To-be-adapted HTTP message headers sent by Squid to
> the ICAP service (HTTP request headers in REQMOD; HTTP
> response headers in RESPMOD). Please note that Squid
> currently does not support logging of HTTP response
> headers in RESPMOD for this format code.

I will clarify that when committing:

http::>h   To-be-adapted HTTP message headers sent by Squid to the ICAP
service (i.e., HTTP request headers in REQMOD or HTTP response headers
in RESPMOD). However, Squid cannot currently log HTTP response headers
sent to the ICAP service (i.e., %http::>h will expand to "-" for RESPMOD
transactions).

Does that sound better?


If the above polished version does not clarify things enough, here is
what is going on:

* The logformat %code is for logging HTTP message headers sent by Squid
to the ICAP service. That will not change and is now properly
documented. What are those to-be-adapted HTTP headers? They are HTTP
request headers in REQMOD and HTTP response headers in RESPMOD. We did
not have to say that, but providing that detail is helpful IMO.

* However, currently, Squid cannot log HTTP response headers sent by
Squid to the ICAP service. The admin will always see "-" in that
specific case. Hopefully, that support will be added in the future. For
now, we just document the lack of support in that particular case.
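
For example, an admin who wants to watch those to-be-adapted headers
could configure something like the following (the format name here is
illustrative, not a predefined one):

logformat icapHdrs %http::>h
icap_log /var/log/squid/icap.log icapHdrs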


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] Squid-4 release checklist

2016-09-13 Thread Alex Rousskov
On 09/13/2016 12:02 AM, Amos Jeffries wrote:
> On 13/09/2016 6:22 a.m., Alex Rousskov wrote:
>> On 09/12/2016 09:54 AM, Amos Jeffries wrote:
>>
>>> * <http://bugs.squid-cache.org/show_bug.cgi?id=4514>
>>>   Windows Update works via interception proxy on 3.5.17, and does not
>>> work via transparent proxy on Squid 4.x.
>>>
>>>  - FWIW; others have been mentioning various issues with various Squid
>>> versions ever since Windows 10 came out.
>>>  - I am inclined to downgrade this unless someone can reproduce it.

>> Why "downgrade"? The importance of a bug should not go _down_ after
>> confirming (or assuming) that the bug affects several Squid versions.

> Agreed, which is why I made that second statement. IF someone can reproduce
> ... then it's a major bug (affecting many people using Squid),
> otherwise it is limited to the one network which other evidence shows is
> broken in many ways down to and below the TCP level.

I have no problems with what you are saying now. I still have problems
with downgrading bug severity level if nobody can reproduce. Perhaps
your "downgrade" means something else or perhaps we are using different
strategies to defining what bug importance/severity field represents.
More on that below.


> The severity rating for the project needs to be
> estimated/determined by how widely it can be expected to affect the
> Squid community.

Please note that I was and still am discussing specific bug metadata,
such as the importance/severity rating field, not any derived metrics.

There are at least two strategies when dealing with the
importance/severity rating field:

* Adjusting severity rating based on the bug spread is a valid strategy.

* The alternative strategy is to separate severity from "reproducibility
in multiple environments". I favor this strategy because it is simpler
and because it reduces tensions between those who suffer from the bug
and those who maintain bug metadata.


With _either_ strategy, when it comes to deciding what to do about the
bug as far as marking Squid branch as stable is concerned, it is too
easy and too convenient for us to misinterpret the absence of "me too"
bug comments as a sign that a bug spread is narrow. Folks suffering from
these bugs may not read your squid-dev post implicitly threatening them
to "ignore" the bugs they are suffering from. I am not saying there are
such folks for this particular bug. I am saying that the approach itself
is dangerous. To reduce that danger, we should be posting these warnings
to squid-users (and to each affected bug report).


>>> * <http://bugs.squid-cache.org/show_bug.cgi?id=4497>
>>>   CloudFlare and hosted sites cannot be connected via SSL\TLS(with DH)
>>> with NONE/503
>>>
>>>  - stuck waiting sufficient information to identify what the problem
>>> actually is. The traces we have so far indicate the network in question
>>> is very noisy with TCP packet retries, duplications, checksum errors,
>>> and outright delivery failures.
>>>  - I am inclined to downgrade unless anyone else is able to reproduce
>>> the issue.

>> Downgrading because others cannot reproduce seems inappropriate.
>> Treating the bug as UNCONFIRMED would be the right course of action if
>> nobody but Yuri can confirm it. The current MOREINFO state can also be
>> used as an excuse to ignore this bug for the purpose of assigning
>> various labels to Squid branches *if* you think that there is actually
>> no bug there (and that getting more info will only confirm that).


> Some context:
> * only affecting one person AFAIK,
> * on a network with visibly borked TCP layer,
> * running proxy on a uncommon proprietary OS (Solaris),
> * with custom patches IIRC,
> * person has a history of reporting almost all bugs as
> blocker/critical/major to get attention,
> * person has a history of actively inhibiting bug resolution when
> frustrated by progress.
> 
> Why penalize the whole world for one admin's somewhat unusual conditions?

Nobody has proposed to penalize the whole world. I am objecting to
adjusting bug metadata (your "downgrading") without sufficient grounds
for doing so. I am _not_ objecting to ignoring that bug for the purpose
of assigning various labels to Squid branches (with sufficient grounds
for doing so). The two actions are very different IMO.

The "context" bullets you provide can be the reasons to ignore this bug
while deciding whether to mark v4.0 as stable. They are not the reasons
to change bug metadata (using the metadata maintenance strategy I
recommend, as discussed above).


>>> * <http://bugs.squid-cache.org/show_bug.cgi?id=4505>
>>>   Memory hits shadow disk entries 

Re: [squid-dev] [PATCH] Incorrect logging of request size

2016-09-13 Thread Alex Rousskov
On 09/13/2016 10:02 AM, Amos Jeffries wrote:
> On 14/09/2016 2:52 a.m., Alex Rousskov wrote:
>>> http::>h 
>>> To-be-adapted HTTP message headers sent by Squid to
>>> the ICAP service (HTTP request headers in REQMOD; HTTP
>>> response headers in RESPMOD). Please note that Squid
>>> currently does not support logging of HTTP response
>>> headers in RESPMOD for this format code.

>> I will clarify that when committing:
>>
>> http::>h   To-be-adapted HTTP message headers sent by Squid to the ICAP
>> service (i.e., HTTP request headers in REQMOD or HTTP response headers
>> in RESPMOD). However, Squid cannot currently log HTTP response headers
>> sent to the ICAP service (i.e., %http::>h will expand to "-" for RESPMOD
>> transactions).
>>
>> Does that sound better?


> Better, but I think remove the words "or HTTP response headers in
> RESPMOD" from that first statement. The second sentence now explains the
> RESPMOD clearly all by itself.


Committed to trunk (r14838) with the following text:

> http::>h
> To-be-adapted HTTP message headers sent by Squid to
> the ICAP service. For REQMOD transactions, these are
> HTTP request headers. For RESPMOD, these are HTTP
> response headers, but Squid currently cannot log them
> (i.e., %http::>h will expand to "-" for RESPMOD).


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


[squid-dev] [RFC] dns_wait_for_all

2016-09-14 Thread Alex Rousskov
Hello,

Currently, when connecting to an origin server, Squid sends
concurrent DNS A and AAAA queries and waits for both answers before
proceeding with the HTTP transaction. If the authoritative DNS server
(or something on its path) breaks or significantly delays IPv6 (AAAA)
transactions, then Squid waits until timeout, even if Squid already has
a usable IPv4 address from a successful A query. This naturally leads to
admins disabling IPv6, willingly or under external pressure.

As Happy Eyeballs algorithms and related discussions/measurements have
established, the best migration path to IPv6 requires making sure that
enabling IPv6 does not create [user-visible] problems. Once that is
accomplished, software can and should prefer IPv6 over IPv4 by default.
I hope that we do not need to revisit those discussions to accept that
principle.

Currently, Squid violates that principle -- enabling IPv6 leads to
user-visible problems, and those problems lead to IPv6 being disabled.


This will not fix all IPv6 problems, but I propose to modify Squid so
that it starts connecting after receiving the first usable DNS response:

> dns_wait_for_all on|off
> 
> Determines whether Squid resolves domain names of all possible
> destinations in all supported address families before deciding which
> IP address to try first when contacting an origin server or cache_peer.
> 
> Before Squid can connect to a peer, it needs an IP address. Obtaining an
> IP address often requires a DNS lookup. Squid often makes two concurrent
> DNS lookups: An "A" query for an IPv4 address and an "AAAA" query for an
> IPv6 address. This directive does not affect the number of DNS queries
> sent or the side-effects of those queries (e.g., IP cache updates), but
> if two concurrent lookups are initiated, and this directive is off, then
> Squid proceeds immediately after receiving the first usable DNS answer.
> 
> This directive does not affect forwarding retries. For example, if
> dns_wait_for_all is off, and Squid gets an IPv4 address first, but the
> TCP connection to that IPv4 address fails, Squid will wait for the IPv6
> address resolution to complete (if it has not yet) and will then connect
> to an IPv6 address (if possible).
> 
> Furthermore, this directive does not affect the number of peer domain
> names that Squid will attempt to resolve or peer addresses that Squid
> may connect to. If Squid is allowed to forward a transaction to two
> peers, then Squid will resolve both peer names and, if failures make it
> necessary, will connect to all IP addresses of both peers (subject to
> other restrictions such as connect_retries).
> 
> See also: dns_v4_first

I suggest enabling this option by default because it will help with
IPv6 adoption, but I certainly do not insist on that default.


While we call both queries "concurrent", Squid sends the AAAA query just
before sending the A query. All other factors being equal, IPv6 will
usually win the DNS race. However, even if AAAA loses, Squid will use
IPv6 the next time it needs to connect to the same server.


From a development point of view, supporting this feature properly means
creating an AsyncJob that will initiate DNS queries and update the
destinations list as the answers come in while informing the caller (if
it is still alive) of any new answers. Today, FwdState does
approximately this:

  1. Call peerSelect(&serverDestinations, fwdPeerSelectionComplete)
 and wait for the fwdPeerSelectionComplete callback.

  2. When fwdPeerSelectionComplete is called,
 start iterating over pre-filled serverDestinations.

To support dns_wait_for_all, FwdState will do _approximately_ this:

  1. Call peerSelect(serverDestinations, fwdPeerSelected)
 and wait for the first fwdPeerSelected subscription callback.

  2. Every time fwdPeerSelected is called,
 start or resume iterating still-unused serverDestinations
 if we were actually waiting for the next destination to try.

The DNS code dealing with concurrent A and AAAA queries will need to be
adjusted to report the first answer while waiting for the second one.
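
To make the intended control flow concrete, here is a standalone toy
model (not Squid code; all names are invented) of a forwarder that
consumes destinations incrementally and treats an explicit end-of-list
signal as an EOF of sorts:

#include <iostream>
#include <string>
#include <vector>

struct Forwarder {
    std::vector<std::string> destinations; // filled incrementally by "DNS"
    size_t nextToTry = 0;
    bool finalized = false; // EOF-like flag: no more destinations coming
    bool connected = false;

    // called for each fresh answer (e.g., the A answer, then the AAAA one)
    void noteDestination(const std::string &addr) {
        destinations.push_back(addr);
        tryNext(); // resume if we were waiting for a destination
    }

    // called when peer selection knows there will be no more answers
    void noteDestinationsEnd() {
        finalized = true;
        tryNext(); // may now report the final failure
    }

    void tryNext() {
        if (connected)
            return; // already talking to a server; keep the rest for retries
        if (nextToTry < destinations.size()) {
            std::cout << "connecting to " << destinations[nextToTry++] << "\n";
            connected = true; // assume success for this toy model
        } else if (finalized) {
            std::cout << "no more destinations; giving up\n";
        }
        // else: keep waiting for more DNS answers to arrive
    }
};

int main() {
    Forwarder fwd;
    fwd.noteDestination("192.0.2.1:80");     // first usable (A) answer
    fwd.noteDestination("[2001:db8::1]:80"); // later (AAAA) answer
    fwd.noteDestinationsEnd();               // EOF-like signal
}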

It is questionable whether the new AsyncJob should continue running
(i.e., resolving more peer names) after FwdState is gone (or no more
retries are needed). However, I do not want to complicate this
discussion by introducing that side effect. We can decide to optimize
that later, with or without another configuration option to control the
behavior.

Once this new infrastructure is in place, we can accommodate IPv6
further by experimenting with limited semi-concurrent TCP connections
and other Happy Eyeballs-inspired tricks (with proxy specifics in mind).


Any better ideas or objections to adding dns_wait_for_all?


Thank you,

Alex.
___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [RFC] dns_wait_for_all

2016-09-14 Thread Alex Rousskov
On 09/14/2016 07:26 PM, Amos Jeffries wrote:
> On 15/09/2016 8:15 a.m., Alex Rousskov wrote:
>> Any better ideas or objections to adding dns_wait_for_all?


> In principle okay. However, I was intending to redesign the object we
> store DNS RR results in to achieve this in a more useful way.

If you are talking about Comm::ConnectionList/serverDestinations, then
that object will have to be redesigned to support this proposal, of
course. The object would also have to be safely shared across FwdState,
peer selection code, and DNS code for the proposed feature to work.


> * slow DNS responses could add to the chain while earlier results are
> still in the Async queue pending delivery to a caller.

... and while the caller is going through the TCP connect steps. That
TCP connect may be unsuccessful, requiring another attempt, possibly
using a different address. We already have the retrying code in place
and I am not proposing to change those algorithms during this project,
but the timing of new destinations arrival will change.


>  - this would obsolete the need for dns_wait_for_all to be configurable.

I have nothing specific against making the proposed algorithm the only
one, but I am not sure that _all_ side effects of the current
resolve-everything-first code are useless in all environments. AFAICT,
supporting both algorithms is easy (it is supporting the proposed
algorithm that is difficult).


> * ip/fqdn caches can store older data up to some timeout in TTL / LRU
> sequence.

I am not sure the DNS cache should be affected by the new lookup results
handling algorithm or the new serverDestinations storage of those
results. AFAICT, the DNS cache should store each result independently
while serverDestinations stores an ordered collection of results (some
of which may come from the DNS cache).

If you think that caching must change as a part of this project, can you
detail the related caching changes you would expect or foresee? If
caching changes are mostly independent, then we can discuss them
later/separately (to stay focused).


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [RFC] dns_wait_for_all

2016-09-15 Thread Alex Rousskov
On 09/15/2016 03:50 AM, Amos Jeffries wrote:
> On 15/09/2016 5:11 p.m., Alex Rousskov wrote:
>> On 09/14/2016 07:26 PM, Amos Jeffries wrote:
>>> On 15/09/2016 8:15 a.m., Alex Rousskov wrote:
>>>> Any better ideas or objections to adding dns_wait_for_all?
>>
>>
>>> In principle okay. However, I was intending to redesign the object we
>>> store DNS RR results in to achieve this is a more useful way. 
>>
>> If you are talking about Comm::ConnectionList/serverDestinations, then
>> that object will have to be redesigned to support this proposal, of
>> course. The object would also have to be safely shared across FwdState,
>> peer selection code, and DNS code for the proposed feature to work.
> 
> I mean Dns::LookupDetails, ipcache_addrs and rfc1035_rr.

> Something like:
> 
> * ipcache_addrs::in_addrs becomes a list of rfc1035_rr instead of an array
> of IP addresses (or an equivalent type storing the same details in an
> easier-to-access form than 'rdata' uses).
> 
> * ipcache_addrs::bad_mask moved into rfc1035_rr for a mask on the
> records in each RR separately.
> 
> * ipcache_addrs merged into Dns::LookupDetails.

That sounds mostly unrelated to the problem at hand. If we need to alter
Dns::LookupDetails, ipcache_addrs, or rfc1035_rr in some significant
way, then we will come back to this discussion. Otherwise, it stays as a
separate/independent not-yet-discussed project.


> The serverDestinations not changing (yet).

I am pretty sure we have to change that field to implement
dns_wait_for_all functionality correctly. For example:

* we need to make its updates (from different jobs!) safe, with
easy-to-trace debugging.

* we need to support an EOF-like semantics to signal whether more
destinations may still be provided.


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Incorrect logging of request size

2016-09-15 Thread Alex Rousskov
On 09/15/2016 12:25 PM, Eduard Bagdasaryan wrote:
> 2016-09-13 20:42 GMT+03:00 Alex Rousskov
> :
>> Committed to trunk (r14838)
> 
> I am attaching v3.5 port of this r14838 and also r14839 and r14840,
> containing several related fixes.

Amos, are you willing to include this fix in v3.5? It is fairly small,
but ModXact::finalizeLogInfo() changes are difficult (for me) to check.

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] FYI: trunk ContextPtr changes

2016-09-19 Thread Alex Rousskov
On 09/18/2016 08:04 PM, Amos Jeffries wrote:
> I have split the work into batches to prevent it being one huge hairy
> change diff that nobody can read.

AFAICT, the correct way to achieve your goals is to merge a branch
containing those "batches" as commits. That way, you get a single
top-level trunk commit that does not bother anybody (or can be undone if
it does) plus individual secondary commits that are available if
somebody needs to analyze the individual changes.

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


[squid-dev] [PATCH] Bug 3819: "fd >= 0" assertion in file_write() during reconfiguration

2016-09-19 Thread Alex Rousskov
Hello,

The attached trunk patch fixes bug 3819 in our tests.

Please note that the underlying problem may lead to other assertions,
including "NumberOfUFSDirs" in UFSSwapDir and "fd >= 0" in file_close().

Since trunk r9181.3.1, reconfiguration is done in at least two steps:
First, mainReconfigureStart() closes/cleans/disables all modules
supporting hot reconfiguration and then, at the end of the event loop
iteration, mainReconfigureFinish() opens/configures/enables them again.
UFS code hits assertions if it has to log entries or rebuild swap.state
between those two steps.  The patch disables or delays those activities
during reconfiguration.

The tiny assertion opportunity window hides these bugs from most proxies
that are not doing enough disk I/O or are not being reconfigured
frequently enough.


HTH,

Alex.

Bug 3819: "fd >= 0" assertion in file_write() during reconfiguration

Other possible assertions include "NumberOfUFSDirs" in UFSSwapDir and
"fd >= 0" in file_close().

Since trunk r9181.3.1, reconfiguration is done in at least two steps:
First, mainReconfigureStart() closes/cleans/disables all modules
supporting hot reconfiguration and then, at the end of the event loop
iteration, mainReconfigureFinish() opens/configures/enables them again.
UFS code hits assertions if it has to log entries or rebuild swap.state
between those two steps. The tiny assertion opportunity window hides
these bugs from most proxies that are not doing enough disk I/O or are
not being reconfigured frequently enough.

Asynchronous UFS cache_dirs such as diskd were the most exposed, but
even blocking UFS caching code could probably hit [rebuild] assertions.

The swap.state rebuilding (always initiated at startup) probably did not
work as intended if reconfiguration happened during the rebuild time
because reconfiguration closed the swap.state file being rebuilt. We now
protect that swap.state file and delay rebuilding progress until
reconfiguration is over.

TODO: To squash more similar bugs, we need to move hot reconfiguration
support into the modules so that each module can reconfigure instantly.

=== modified file 'src/fs/ufs/RebuildState.cc'
--- src/fs/ufs/RebuildState.cc	2016-04-03 23:41:58 +0000
+++ src/fs/ufs/RebuildState.cc	2016-09-16 01:17:04 +0000
@@ -74,43 +74,45 @@ Fs::Ufs::RebuildState::RebuildState(RefC
 
     if (!clean)
         flags.need_to_validate = true;
 
     debugs(47, DBG_IMPORTANT, "Rebuilding storage in " << sd->path << " (" <<
            (clean ? "clean log" : (LogParser ? "dirty log" : "no log")) << ")");
 }
 
 Fs::Ufs::RebuildState::~RebuildState()
 {
     sd->closeTmpSwapLog();
 
     if (LogParser)
         delete LogParser;
 }
 
 void
 Fs::Ufs::RebuildState::RebuildStep(void *data)
 {
     RebuildState *rb = (RebuildState *)data;
-    rb->rebuildStep();
+    if (!reconfiguring)
+        rb->rebuildStep();
 
-    if (!rb->isDone())
+    // delay storeRebuildComplete() when reconfiguring to protect storeCleanup()
+    if (!rb->isDone() || reconfiguring)
         eventAdd("storeRebuild", RebuildStep, rb, 0.01, 1);
     else {
         -- StoreController::store_dirs_rebuilding;
         storeRebuildComplete(&rb->counts);
         delete rb;
     }
 }
 
 /// load entries from swap.state or files until we run out of entries or time
 void
 Fs::Ufs::RebuildState::rebuildStep()
 {
     currentEntry(NULL);
 
     // Balance our desire to maximize the number of entries processed at once
     // (and, hence, minimize overheads and total rebuild time) with a
     // requirement to also process Coordinator events, disk I/Os, etc.
     const int maxSpentMsec = 50; // keep small: most RAM I/Os are under 1ms
     const timeval loopStart = current_time;
 

=== modified file 'src/fs/ufs/UFSSwapDir.cc'
--- src/fs/ufs/UFSSwapDir.cc	2016-04-03 23:41:58 +0000
+++ src/fs/ufs/UFSSwapDir.cc	2016-09-16 01:05:19 +0000
@@ -300,41 +300,42 @@ Fs::Ufs::UFSSwapDir::init()
 void
 Fs::Ufs::UFSSwapDir::create()
 {
     debugs(47, 3, "Creating swap space in " << path);
     createDirectory(path, 0);
     createSwapSubDirs();
 }
 
 Fs::Ufs::UFSSwapDir::UFSSwapDir(char const *aType, const char *anIOType) :
     SwapDir(aType),
     IO(NULL),
     fsdata(NULL),
     map(new FileMap()),
     suggest(0),
     l1(16),
     l2(256),
     swaplog_fd(-1),
     currentIOOptions(new ConfigOptionVector()),
     ioType(xstrdup(anIOType)),
     cur_size(0),
-    n_disk_objects(0)
+    n_disk_objects(0),
+    rebuilding_(false)
 {
     /* modulename is only set to disk modules that are built, by configure,
      * so the Find call should never return NULL here.
      */
     IO = new Fs::Ufs::UFSStrategy(DiskIOModule::Find(anIOType)->createStrategy());
 }
 
 Fs::Ufs::UFSSwapDir::~UFSSwapDir()
 {
     if (swaplog_fd > -1) {
         file_close(swaplog_fd);
         swaplog_fd = -1;
     }
     xfree(ioType);
     delete map;
     delete IO;
     delete currentIOOptions;
 }
 
 void
@@ -708,78 +709,81 @@ Fs::Ufs::UFSSwapDir::logFile(char cons

Re: [squid-dev] [RFC] dns_wait_for_all

2016-09-20 Thread Alex Rousskov
On 09/20/2016 04:41 AM, Amos Jeffries wrote:
> On 16/09/2016 3:35 a.m., Alex Rousskov wrote:
>> On 09/15/2016 03:50 AM, Amos Jeffries wrote:
>>> The serverDestinations not changing (yet).

>> I am pretty sure we have to change that field to implement
>> dns_wait_for_all functionality correctly. For example:
>>
>> * we need to make its updates (from different jobs!) safe, with
>> easy-to-trace debugging.
>>
>> * we need to support an EOF-like semantics to signal whether more
>> destinations may still be provided.


> I think you are mistaking serverDestinations as being part of the DNS
> operations. It is not.

I am not thinking in those terms. serverDestinations today is populated
by peer selection code, based on DNS lookups (including DNS cache
lookups). Those DNS lookups create two problems that dns_wait_for_all is
solving (i.e., waiting for the second DNS query in each address
resolution transaction and waiting for all peer address resolution
transactions).


> serverDestinations is state in the FwdStateData 'Job', the IP addresses
> used by it have to go through the peer selection 'Job' processing and
> related access controls applied to whatever comes out of an ipcache
> lookup 'Job'.

Agreed. serverDestinations state is shared among peer selection and
FwdState code. Today, the sharing is sequential: After peer selection
fills serverDestinations, FwdState starts using serverDestinations. The
sequential nature of this approach creates problems. Dns_wait_for_all
solves some of them.


> DNS lookups are buried 3+ 'Job' layers away on the other side of
> ipcache. All they do is feed that cache with data.

Yes, there are many layers.

The DNS code itself does not quite feed the cache (it just sends an
answer to the requester which happens to be the cache), and the cache is
not quite a layer: A fresh DNS answer is given to both the cache and the
ipcache_nbgethostbyname() callback. There might be some unwelcome
dependencies in the code that require that fresh answer to be cached,
but they should not exist and they are unimportant when we discuss high
level concepts.


> If you are intending to alter serveDestinations operations, then please
> dont call the option dns_* . It is a tcp_outgoing_* thing.

I do not recommend using the tcp_outgoing name prefix because the
directive has nothing to do with TCP (and everything to do with DNS).
However, I am not going to fight you on this -- we usually disagree on
how things should be named, and there usually are not enough formal
arguments that can be made to support a given name. If nobody speaks up,
you will be able to rename the directive however you want. I am
certainly not a big fan of dns_wait_for_all!

Needless to say, the directive name should reflect what the feature does
from Squid admin point of view (the code changes implementing the
feature are irrelevant when it comes to naming). The proposed feature
determines whether Squid waits for certain DNS lookups when forwarding
transactions to peers.


Cheers,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Bug 3819: "fd >= 0" assertion in file_write() during reconfiguration

2016-09-20 Thread Alex Rousskov
On 09/20/2016 04:10 AM, Amos Jeffries wrote:
> On 20/09/2016 9:52 a.m., Alex Rousskov wrote:
>> The attached trunk patch fixes bug 3819 in our tests.

> +1. Thank you.

Committed with one additional fix to trunk (r14815 and r14816).

The bug report has a v3.5 patch providing the same fix:
http://bugs.squid-cache.org/show_bug.cgi?id=3819#c28


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Bug 3819: "fd >= 0" assertion in file_write() during reconfiguration

2016-09-20 Thread Alex Rousskov
On 09/20/2016 11:58 AM, Alex Rousskov wrote:
> On 09/20/2016 04:10 AM, Amos Jeffries wrote:
>> On 20/09/2016 9:52 a.m., Alex Rousskov wrote:
>>> The attached trunk patch fixes bug 3819 in our tests.
> 
>> +1. Thank you.
> 
> Committed with one additional fix to trunk (r14815 and r14816).

Correction: trunk r14849.

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] OSX transparent-proxy using pfctl

2016-09-26 Thread Alex Rousskov
On 09/26/2016 12:59 PM, Shively, Gregory wrote:

> The patch calls /sbin/pfctl to get the
> redirect state information

For every intercepted connection, this patch forks Squid to start a
shell (which then starts pfctl and awk) and then blocks Squid on that
shell output, right? That feels very expensive performance-wise. I
wonder whether Squid should emit a corresponding one-time warning, to
minimize the number of folks who are going to assume that this works
well under load and then complain that their Squid is "slow".

The awk part can be eliminated by parsing in Squid, but I am not sure
that optimization is worth it when there is so much overhead in this
code besides awk.


> Let me know if I should continue down the road on getting test-builds.sh
> running on OSX.

IMO, you should not be responsible for fixing any old test-builds.sh
bugs, only for the problems your changes add or cause.


> +char *cmd = (char *)malloc(sizeof(char) * cmdLen);
> +snprintf(cmd, cmdLen, cmdFormat, daddr, saddr, established);

Please rewrite using SBuf. AFAICT, what you want can be written without
all those unsafe C things like malloc() and snprintf() -- you are simply
concatenating a few strings to form a command.
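
For example, something along these lines (a sketch; the variable names
follow the patch, but the exact pfctl/awk command text is omitted here):

SBuf cmd("/sbin/pfctl -s states | awk '...' "); // fixed command prefix
cmd.append(daddr);
cmd.append(' ');
cmd.append(saddr);
cmd.append(' ');
cmd.append(established);
// cmd.c_str() can then be handed to execl()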


> +execl("/bin/sh", "/bin/sh", "-c", cmd, NULL);

Please propagate or otherwise handle execl() errors.
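
For instance (a sketch; the exact logging is up to you):

execl("/bin/sh", "/bin/sh", "-c", cmd, (char *)NULL);
// execl() returns only on failure; do not fall through silently
debugs(89, DBG_IMPORTANT, "execl(/bin/sh) failed: " << xstrerror());
_exit(EXIT_FAILURE); // terminate the forked child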


> +FILE *fp = fdopen(pipefd[0], "r");
> +while (!feof(fp)) {

Please handle fdopen() errors instead of ignoring them and feeding a nil
pointer to feof().


> +FILE *fp = fdopen(pipefd[0], "r");
...
> +close(pipefd[0]);
> +return true;
...
> +close(pipefd[0]);
> +return false;

Leaking "fp"? AFAICT, you are supposed to close "fp" instead of
pipefd[0] after fdopen() but I am not sure.


> +debugs(89, DBG_IMPORTANT, HERE << "PFCTL failed to find state");
> +return false;

Leaking both "fp" and pipefd[0]?


> +char rdaddr[MAX_IPSTRLEN + 6];

AFAICT, the input line may have more characters than that because ":",
"\n", and "\0" all consume space (I assume 6 is for the ":port").

Also, this declaration is not needed outside the while loop.


> +char *portPtr = strchr(rdaddr, '\n');
> +if (portPtr) *portPtr = '\0';

Should not we reject truncated lines (without '\n') instead of hoping
that the port number or the address itself was not truncated?


> +if (errno == EINTR || errno == EAGAIN) {
> +continue;
> +}

This code has no effect AFAICT -- the loop will continue with or without
that if statement. Did you mean to break the loop on all other errors
instead?


I recommend removing "fp" and reading from pipefd[0] directly instead.
You can use Tokenizer to safely parse what you read without these
error-prone C tricks.
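
A sketch of what that could look like, assuming IPv4 "a.b.c.d:port"
output lines (the CharacterSet would need extending if pfctl can emit
IPv6 here):

#include "parser/Tokenizer.h"

static bool
parseRedirectLine(const SBuf &line, SBuf &ip, int64_t &port)
{
    static const CharacterSet ipv4Chars = CharacterSet("IPv4", ".") + CharacterSet::DIGIT;
    Parser::Tokenizer tok(line);
    // reject the line unless it is exactly "address:port"
    return tok.prefix(ip, ipv4Chars) && tok.skip(':') &&
           tok.int64(port, 10, false) && tok.atEnd();
}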

> + CPPFLAGS=\"-D_SQUID_APPLE -Wno-error=deprecated-declarations\" 
> LDFLAGS=-lresolv \
...
> -DISTCHECK_CONFIGURE_FLAGS=""
> +DISTCHECK_CONFIGURE_FLAGS="CPPFLAGS=\"-D_SQUID_APPLE 
> -Wno-error=deprecated-declarations\" LDFLAGS=-lresolv"

These temporary for-testing changes should not be a part of the patch.


HTH,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] OSX transparent-proxy using pfctl

2016-09-28 Thread Alex Rousskov
On 09/28/2016 12:18 PM, Shively, Gregory wrote:
> The one-time warning sounds like a good idea. Is there a place where
> you want the one-time message added, or should I just add a static variable
> to determine if the warning has been displayed the first time down this code
> path?

If nobody recommends a better place, then place the warning in your
code. "bzr grep WARNING src" for examples.


>> IMO, you should not be responsible for fixing any old test-builds.sh bugs,
>> only for the problems your changes add or cause.

> Thanks - I was mistaken and thought from the merge document you sent that
> I should have run the test-builds.sh. 

I did not send a merge document. You should run the test-builds.sh. I do
not know whether running test-builds.sh will expose any bugs you are
responsible for. If running test-builds.sh does not expose any bugs you
are responsible for [because of all the other bugs that you are not
responsible for], then simply say so.


>>> +char *cmd = (char *)malloc(sizeof(char) * cmdLen);
>>> +snprintf(cmd, cmdLen, cmdFormat, daddr, saddr, established);

>> Please rewrite using SBuf. AFAICT, what you want can be written without all
>> those unsafe C things like malloc() and snprintf() -- you are simply
>> concatenating a few strings to form a command.

> Does just using the appendf() method sound alright? 

Sounds bad to me. Just use SBuf::append() to concatenate strings and
characters instead of turning off many C++ protections by using printf()
or equivalent.


> And am I
> correct that the method will throw an exception that should just roll up
> the stack?

Yes, SBuf methods should throw on overflows and similar non-logic
errors. Whether your code can handle those exceptions correctly, I have
not checked.


>>> +FILE *fp = fdopen(pipefd[0], "r");
>> ...
>>> +close(pipefd[0]);
>>> +return true;
>> ...
>>> +close(pipefd[0]);
>>> +return false;

>> Leaking "fp"? AFAICT, you are supposed to close "fp" instead of pipefd[0]
>> after fdopen() but I am not sure.


>>> +debugs(89, DBG_IMPORTANT, HERE << "PFCTL failed to find state");
>>> +return false;

>> Leaking both "fp" and pipefd[0]?


> If my memory serves, I think the FILE * returned from fdopen() is just a
> pointer to the same data structure in the kernel that the fd is
> associated with.

It is not just a pointer -- there would be no reason to add all those
f*() functions if it were! FILE is a libc object with buffers. It also
includes a pointer to the kernel structures, of course.


> I'll go ahead and change it to a fclose()

Yes, that is the right solution if you insist on using FILE.


>> I recommend removing "fp" and reading from pipefd[0] directly instead.
>> You can use Tokenizer to safely parse what you read without these error-
>> prone C tricks.


> The reason that I used fgets() instead of read() on the pipefd[0] is that I'm
> reading either the length of rdaddr or until I read a newline, whichever
> comes first.

My recommendation was given with the above assumption in mind.


> If I used read(), if the length of the line was less than rdaddr len
> (and there was more than one line), it could read the beginning of the next
> line.

Yes, but it is not a problem if you parse correctly. Currently, you are
combining partial parsing with partial reading. Splitting them often
avoids nasty bugs that result from such a combination.


> I could use Tokenizer or it looks like Just SBuf

You should use Tokenizer instead of manually parsing what looks like
trivial syntax. Parsing problems in the first patch version imply that
it is not as trivial as it may seem (or, to be more precise, parsing
trivial syntax by hand is still error-prone!).


> if I was to use just read(), I'd have to read a character at a time from the 
> fd

Why not read, say, 4096 characters at a time?


>  or if you want me to throw the input stream into a SBuf or
> Tokenizer instead of just using some C calls.

I do, but I cannot insist on that. You can stop polishing the patch as
soon as it has no known bugs. Others will decide whether to accept it.


Please note that you have not shared updated changes AFAICT. The
squid-3.5.20-osx-devpf-transparent.patch you attached looked old to me.


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


[squid-dev] [RFC] Support concurrent SBuf::c_str() calls

2016-09-29 Thread Alex Rousskov
Hello,

The current trunk code contains at least two serious bugs caused by
SBuf::c_str() misuse. Both known bugs look similar:

> storeCreateEntry(storeUri.c_str(), storeUri.c_str(), ...);

and

> storeCreateEntry(uri.c_str(), uri.c_str(), ...);


Both use cases violate safe c_str() use conditions documented in the
long and occasionally YELLING description of that method. As you have
probably guessed by now, the c_str() method itself is not constant, so
calling c_str() twice often destroys the first c_str() result.

It is trivial to fix both violations, but it is very difficult to say
how many other similar violations are hidden in (or will be added to)
the code. I am sure we will continue to miss them during code reviews.

This particular problem is caused by the "used storage size" increment
in the SBuf::c_str() implementation:

> *rawSpace(1) = '\0'; // terminate the buffer
> ++store_->size; // protect the terminator character

The increment forces the second call to c_str() to re-allocate the
store, which usually destroys the still-in-use storage pointed to by the
first c_str() result.
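
For illustration, a sketch of the failure mode (buf and foo() are
placeholders):

  const char *a = buf.c_str(); // 0-terminates; used storage size grows
  const char *b = buf.c_str(); // may reallocate the store to fit '\0'
  foo(a, b); // "a" may now point into freed memory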

If we remove the increment, then Squid will work as expected when using
two concurrent c_str() results.

With or without the increment, Squid may crash or misbehave when buggy
code appends to SBuf while a previous c_str() result is still in use:
With an increment, appending may destruct the previous c_str() buffer.
Without an increment, appending may overwrite the 0-terminator used by
the previous c_str() call.

Valgrind is able to detect the "concurrent c_str()s" bugs in the current
trunk. My Valgrind tests did not expose other related bugs when I
removed the increment, but that does not mean those other bugs do not
exist, of course.

Should we remove the increment to make concurrent c_str() calls safe?


Thank you,

Alex.
___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] OSX transparent-proxy using pfctl

2016-09-29 Thread Alex Rousskov
On 09/29/2016 01:12 PM, Shively, Gregory wrote:

> Sometimes these mailing lists make me think like I'm talking to one
> person :-).

Glad we all sound coherent to you :-)!


> ERROR: files left in build directory after distclean:
> ./src/cf_gen.dSYM/Contents/Info.plist
> ./src/cf_gen.dSYM/Contents/Resources/DWARF/cf_gen

> I can run with the "--keep-going" flag, but that ends with the
> equivalent error in btlayer-05-no-deps-esi. Do I run it this way and
> just grep the logs for FAIL or something?

Sounds like a good plan to me.


> I wasn't sure if I should handle it or let it flow up, since if it
> was in an overflow state I would doubt I could handle this packet,
> but maybe the next connection would be successful.

I recommend temporarily adding an exception with Must(false) inside the
parsing code and testing what happens. This approach may give you enough
information to decide whether your code or the caller should be
responsible for handling parsing exceptions.
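
For example, a temporary probe (to be reverted after testing):

  Must(false); // simulate a parsing failure to see who catches it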


> +while ((n = read(pipefd[0], buf, sizeof(buf))) > 0) {
> +state.append(buf);

You need to tell SBuf how much to append(). read(2) does not 0-terminate
buf!
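
Something like this sketch:

  while ((n = read(pipefd[0], buf, sizeof(buf))) > 0)
      state.append(buf, n); // append exactly the n bytes just read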


> +if (n == -1) {
> +int xerrno = errno;

Too late: errno may already be different after close().
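
Capture errno immediately after the failing call, before anything else can
overwrite it. A sketch:

  const int xerrno = errno; // right after read() fails
  close(pipefd[0]);
  debugs(89, DBG_IMPORTANT, "reading from pfctl failed: " << xstrerr(xerrno));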


> +// Move to EOL and try again from the next line
> +tk.token(host, lf);

If possible, use one of the skip() methods to avoid the implication that
you are parsing host here (and to save a couple of CPU cycles).

Please note that Tokenizer has a method to extract integers. I am not
sure it will help you here, but it might.
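
For example, a sketch (anythingButLf is an assumed CharacterSet, and
int64() is the Tokenizer integer extractor):

  static const CharacterSet anythingButLf = lf.complement("anythingButLf");
  (void)tk.skipAll(anythingButLf); // move to EOL without "parsing" a host
  (void)tk.skipAll(lf);            // and past the line terminator

  int64_t port = 0;
  if (tk.int64(port))
      ... // use port directly instead of atol(port.c_str())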

> +newConn->local = host.c_str();
> +newConn->local.port(atol(port.c_str()));
> +debugs(89, 5, HERE << "address NAT: " << newConn);
> +return true;

> +newConn->local = host.c_str();
> +newConn->local.port(atol(port.c_str()));
> +debugs(89, 5, HERE << "address NAT: " << newConn);

If you can avoid code duplication, please do so.

> +CharacterSet bracket("bracket", "[]");
> +CharacterSet colon("colon", ":");
> +CharacterSet lf("lf", "\n");
> +CharacterSet ws("ws", " \t\r\n");

These should be declared as static to avoid allocating/initializing lots
of bytes on every access and as constant to help the optimizer (and to
avoid future bugs). And you can move them inside the loop then -- they
are unused outside the loop.

In general, declare variables as late as possible and make them const if
possible.
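
For example:

  static const CharacterSet lf("lf", "\n"); // built once, immutable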


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] OSX transparent-proxy using pfctl

2016-09-29 Thread Alex Rousskov
On 09/29/2016 03:48 PM, Shively, Gregory wrote:
>>> I wasn't sure if I should handle it or let it flow up, since if it was
>>> in an overflow state I would doubt I could handle this packet, but
>>> maybe the next connection would be successful.

>> I recommend temporarily adding an exception with Must(false) inside the
>> parsing code and testing what happens. This approach may give you enough
>> information to decide whether your code or the caller should be responsible
>> for handling parsing exceptions.



> If I understand correctly, the Must(false) goes all the way up the
> stack to main and exits squid. If instead it falls thru and returns
> false - the connection will fail and the next connection will make
> another attempt at parsing.

Must(false) just throws an exception. Where that exception will be
caught before main() when thrown from your code, I do not know (perhaps
you already do). Must(false) is just a trick to check whether all
exceptions in your code will be caught/handled the way you want. The
alternative is to trigger real exceptions, but that is often a lot harder.

In general, we should be using as few try/catch as possible to
accomplish the desirable outcome. Thus, always adding try/catch "just in
case" is the wrong approach (I am _not_ saying you are doing that!).
Until Squid exception handling is improved, testing is often the best
way to find the right place.

If you decide that a try/catch is necessary, place it as high as
possible and please do not forget to print the caught exception. There
are examples in the code.


Please note that you do not have to catch exceptions even if they kill
Squid. The current lack of try/catchers around critical contexts is not
your problem. Add try/catch only if you think catching close to your new
code is necessary for your code to work well under more-or-less "normal"
conditions. I doubt it is currently necessary because you are not using
exceptions for benign errors.


> +if (n == -1) {
> +int xerrno = errno;
> +debugs(89, DBG_IMPORTANT, HERE << "Reading from PFCTL failed: " << 
> xstrerr(xerrno));
> +return false;
> +}
> +
> +close(pipefd[0]);

Now leaking pipefd[0] on errors. Someday, Squid will have an auto-close
descriptor wrapper...


> +SBuf host;
> +int64_t port;
> +
> +if (tk.token(host, bracket) || tk.token(host, colon)) {

Nice rewrite! You can move the port declaration lower, and you should.
And no empty lines between the declaration and the single-statement code
that uses that declaration exclusively:

SBuf host;
if (...) {
   int64_t port = 0;
   if (...) ...
}


> +while (!tk.atEnd()) {
> +if (tk.token(host, bracket) || tk.token(host, colon)) {
...
> +}
> +
> +tk.skip('\n');
> +}

An infinite loop if the actual input contains, say, a single space and
nothing else? Or a token without brackets and colons? That is unlikely,
of course, but perhaps not completely impossible. I am not dictating
code, but to skip an arbitrary bad line you might need something like
this at the end of an iteration:

  // the combination covers all characters so we will make progress
  (void)tk.skipAll(anythingButLf); // silently skip any garbage
  (void)tk.skipAll(lf);
  // either at the beginning of the next line or eof


BTW, I believe the previous version of your patch had a similar infinite
loop problem, but now it became obvious, thanks to your nice refactoring.


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] OSX transparent-proxy using pfctl

2016-09-30 Thread Alex Rousskov
On 09/30/2016 09:04 AM, Shively, Gregory wrote:

> How about I get rid of the loop altogether

All other factors being equal, a single statement is better than a loop
with a similar statement inside.

> - I should be
> only getting one line from pfctl, and if the parsing fails - I should
> probably just be returning false anyway instead of attempting to find
> what I expect later in the buffer. I was trying to be safe from the
> parsing, but I'm starting to think I should just fail out of the
> parsing if the buffer doesn't have what I expect.

I do not know enough about pfctl and the corresponding environment to
answer your question with certainty but it looks like your awk command
should protect Squid from any irrelevant/garbage lines. If nobody
else speaks up, this is your decision to make.


> +cmd.append("\" && $8 == \"");
> +cmd.append("ESTABLISHED:ESTABLISHED\" {print $5}'");

These should probably be merged into one. If no other changes are
needed, this merge can be done during commit.

> +int xerrno = errno;

> +pid_t pid = fork();

> +int xerrno = errno;

> +int xerrno = errno;

All should be "const". If no other changes are needed, these "const"s
can be added during commit.


> +debugs(89, 5, HERE << "address NAT: " << newConn);

No HERE in trunk, but it can be removed during commit.

I also recommend showing "state" contents in case you are returning
false because the PFCTL output did not match your expectations.
Something like:

debugs(89, 3, "no address in " <<
   Raw("PFCTL output", state.rawContent(), state.size()));
return false;

That will cover the important case of no/empty output as well.

> +enter_suid();
> +execl("/bin/sh", "/bin/sh", "-c", cmd.rawContent(), NULL);
> +leave_suid();
> +exit(EXIT_FAILURE);

I wonder whether we should [save and] return the errno here instead of
generic EXIT_FAILURE?
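
For example, a sketch (keeping in mind that exit codes are truncated to
eight bits):

  // execl() only returns on failure
  const int xerrno = errno;
  exit(xerrno ? xerrno : EXIT_FAILURE);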


> A different question - I'm wondering if I should move where this
> conditional compilation is being placed with regard to the other
> conditional compilation. I had put it in based on the last
> compilation options I had used when attempting to get this working
> before looking in the code. I didn't know if this would be enabled by
> configure or some other macro that might be defined on OSX. But it
> looks like I have it inside the conditional compilation for
> !USE_NAT_DEVPF. Do you think it should be moved?

Sorry, I do not know. Amos may know the answer to this question.


> Regarding the test_builds.sh - got it to run with the --keep-going
> and grepped out of the logs - only PASS in the summary section. This
> was with the current patch. I'll run it again, but at least know that
> it is working.

Excellent.

I have no further comments about the code. Thanks a lot for polishing it!

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] polish Config2 using C++11 features

2016-09-30 Thread Alex Rousskov
On 09/30/2016 09:38 AM, Amos Jeffries wrote:
> We now seem to have had several patches successfully use members
> declared with default values and/or with the "*this = Foo();" shortcut
> for a reset/clear method.

The *this assignment works for pre-C++11 v3.5 as well. The default
values will not work there so we should be extra careful when
backporting affected code.


> So I think we can start using these to replace old C-style
> initialization and clear() functions.

Agreed. I wonder whether we should be consistent -- if some class data
members cannot use default initialization and some can, should we
initialize _all_ members in the constructor(s)?


> This patch begins by replacing the Config2 use of memset(). I for one am
> constantly mistaking which of the Config objects has memset() applied to
> it at the global level when reconfigure happens. Now we do not need to
> care, each object handles its own clearing one way or another.

+1

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [RFC] Support concurrent SBuf::c_str() calls

2016-09-30 Thread Alex Rousskov
On 09/29/2016 09:19 PM, Amos Jeffries wrote:
> On 30/09/2016 5:03 a.m., Alex Rousskov wrote:
>> Should we remove the increment to make concurrent c_str() calls safe?

> The reason it exists remember is to prevent other SBuf sharing that
> storage MemBuf from thinking they can append to the storage over top of
> the terminator. 

Yes, I know, but that increment not only prevents problems but _causes_
them as well: The increment may force the SBuf to realloc, which may
result in deallocation of the original storage still pointed to by the
earlier c_str() result.


> Given that we have no easy knowledge of how the c_str()
> is going to be used by the recipient code we cannot guarantee lack of
> terminator is safe.

Yes, with or without the increment, Squid may crash or misbehave when
buggy code appends to SBuf while a previous c_str() result is still in
use. The lack of increment is not safe. The presence of the increment is
not safe. The problems are different, but they exist with or without the
increment.


> Time to push SBuf usage forward and aggressively remove the need for
> c_str()?

The need for c_str() will never go away because libraries outside of
Squid control need c-strings. And, due to the lack of proper QA, any
"aggressive push" affecting lots of Squid code is a recipe for a
disaster IMO.

The question is which c_str() implementation is safer? The one that may
crash Squid when called twice for the same SBuf in the same expression
or the one that may crash Squid when appending to the same SBuf in the
same expression? There are already two known cases of the former. There
are currently no known cases of the latter.


Overall, I know of three primary ways to implement c_str():

1. Always 0-terminate SBuf-used storage. Safest implementation possible,
but has serious negative performance overheads, especially for tokenizing.

2. Terminate the used storage at the time of c_str().

2a. Increment the used storage size. Current implementation.
Leads to problems discussed in this thread.

2b. Do not increment the used storage size.
Leads to other problems discussed in this thread.

3. Allocate and store a new c-string as needed, copying from the
storage. Safest implementation possible but has serious negative
performance overheads. These overheads affect different use cases than
those of #1.

Needless to say, we can add clever code to reduce risks associated with
some implementations. For example (these are just rough sketches/ideas):

a) Appending code in #2b can check whether it is about to overwrite the
0-terminator added by a previous c_str() call and, if yes, preserve the
old storage (we can preserve one such storage or maintain a global FIFO
queue).

b) C-string allocating code in #3 can maintain a global FIFO queue of
previously allocated but now supposedly unused c-strings to minimize the
risk of de-allocating a still used c-string.

c) C-string allocating code in #3 can delay allocation until it might be
needed, similar to how (a) decides that the 0-terminator is in danger.


Going forward, do you think we should:

i. Keep using #2a and rewrite the known double-c_str() code.
ii. Make double-c_str() code safe (i.e., switch to #2b).
iii. Add code to make the existing #2 implementation safer.
iv. Switch to a completely different implementation like #1 or #3.


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] polish Config2 using C++11 features

2016-09-30 Thread Alex Rousskov
On 09/30/2016 10:58 AM, Amos Jeffries wrote:
> On 1/10/2016 5:09 a.m., Alex Rousskov wrote:
>> On 09/30/2016 09:38 AM, Amos Jeffries wrote:
>>> So I think we can start using these to replace old C-style
>>> initialization and clear() functions.

>> Agreed. I wonder whether we should be consistent -- if some class data
>> members cannot use default initialization and some can, should we
>> initialize _all_ members in the constructor(s)?

> I was aiming towards that some years ago. But reality turned out to be
> that some of the bigger classes had annoyingly long lists of members to
> initialize, many of which did not exactly need it. And no compiler or
> static checking detects patches that forget to set one if it was a
> complex class with own default constructor (like our Pointer types,
> which were frequently missed).

Sorry I was not clear. I was not proposing a one-size-fits-all approach
that you were trying to enforce some years ago. The consistency I was
talking about has a single class scope. In other words:

* If, for a specific class C, some C members cannot use default
initializers, then none of the C members should use default initializers.

Is that a good rule? Do we need a rule here?
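
To illustrate the case in question (all names are made up):

  class Example {
  public:
      Example(); // needed anyway: "complex" requires computed initialization
      Widget complex;  // cannot use a default member initializer
      int simple = 0;  // can use one; or should Example() set it instead?
  };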


> So these days I'm only setting the basic POD type members explicitly on
> construct. If we can set them like this and skip defining a default
> constructor entirely that would be even better for shrinking the code.

Sure, but the consistency I am worried about is about a different case
where we cannot avoid defining a constructor because some members
require non-constexpr initializers.

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] OSX transparent-proxy using pfctl

2016-09-30 Thread Alex Rousskov
On 09/30/2016 10:37 AM, Amos Jeffries wrote:
> Please make sure that your code debugs() dumps the full pfctl line(s)
> received at level DBG_DATA, and (only) on errors the relevant bit at a
> higher level like 2 or 3 - the other functions debug output can give

This approach is outdated because Raw() takes care of DBG_DATA. The best
general approach today is to use an appropriate non-DATA level for
debugs() while using Raw() for printing data inside those debugs():

  debugs(11, DBG_IMPORTANT, "always printed" << Raw(may be printed...));

By default, small strings are always printed while large strings are
only printed if DBG_DATA debugging level is enabled. The behavior can be
tuned by calling .minLevel() on the Raw object.


For example, given "large" input,

  debugs(89, 5, "successfully parsed" << Raw("pfctl input", ...));

will produce something like

  ... successfully parsed pfctl input

with ALL,5 but something like

  ... successfully parsed pfctl input[150]=large-input-here

with ALL,9.


HTH,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] cleanup removal of needless get()

2016-10-01 Thread Alex Rousskov
On 09/30/2016 11:05 PM, Amos Jeffries wrote:
> The SSL code in particular contains a lot of calls to the get() on
> various Pointer objects from the general code.
> 
> Now that C++11 gives our Pointer better boolean operators, and
> dereference oerators have been added. A bunch of these calls are not
> needing to be explicit. Remove them and use the available class
> operators for more easily read code.
> 
> NOTE that this does not include the get() to produce raw-pointers for
> library API calls.
> 
> Also, this patch does not seek to be an all-inclusive sweeping change. It
> was built using a grep for if-statements using get() output as their
> sole boolean condition and polishing the code surrounding. So there are
> likely many other uses that can be removed in other patches. This is
> just incremental polish.

+1 if you post a simple script that will make these kinds of changes in
an arbitrary Squid branch. I have not checked each change but I trust
that you have done that already.

-0 otherwise. We should not be using .get() when it is not needed, but,
without automation, the post-change merging expenses outweigh the
advantages of this partial sweeping change.

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] polish Config2 using C++11 features

2016-10-01 Thread Alex Rousskov
On 09/30/2016 10:55 PM, Amos Jeffries wrote:
> On 1/10/2016 6:10 a.m., Alex Rousskov wrote:
>> On 09/30/2016 10:58 AM, Amos Jeffries wrote:
>>> On 1/10/2016 5:09 a.m., Alex Rousskov wrote:
>>>> On 09/30/2016 09:38 AM, Amos Jeffries wrote:
>>>>> So I think we can start using these to replace old C-style
>>>>> initialization and clear() functions.
>>
>>>> Agreed. I wonder whether we should be consistent -- if some class data
>>>> members cannot use default initialization and some can, should we
>>>> initialize _all_ members in the constructor(s)?
>>
>>> I was aiming towards that some years ago. But reality turned out to be
>>> that some of the bigger classes had annoyingly long lists of members to
>>> initialize, many of which did not exactly need it. And no compiler or
>>> static checking detects patches that forget to set one if it was a
>>> complex class with own default constructor (like our Pointer types,
>>> which were frequently missed).
>>
>> Sorry I was not clear. I was not proposing a one-size-fits-all approach
>> that you were trying to enforce some years ago. The consistency I was
>> talking about has a single class scope. In other words:
>>
>> * If, for a specific class C, some C members cannot use default
>> initializers, then none of the C members should use default initializers.
>>
>> Is that a good rule? Do we need a rule here?
> 
> What we have now is to do the work to cleanly and properly avoid
> uninitialized members on the build testing. But to treat other members
> which handle themselves as optionally initialized.
> 
> That is sufficient, automatically testable (-Wuninitialized), and can be
> applied as a blanket rule instead of trying to remember where its used
> and where not.
> 
> If one wishes to go around adding needless entries to constructor lists,
> one is free to do so. But it is really pointless work and wasted lines
> of code.
> 
>>
>>> So these days I'm only setting the basic POD type members explicitly on
>>> construct. If we can set them like this and skip defining a default
>>> constructor entirely that would be even better for shrinking the code.
>>
>> Sure, but the consistency I am worried about is about a different case
>> where we cannot avoid defining a constructor because some members
>> require non-constexpr initializers.
> 
> In that case it is optional whether the code explicitly says what the
> compiler will do anyway.
> 
> We might be pushed into that position by "Big 3" or "Big 5"
> requirements. But as optional we can deal with all them case-by-case.

Unfortunately, I see no correlation between my question and your
response. Dropping the topic as not important enough to continue this
discussion. Thus, no new rules [for the new/changed code] for now.

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [RFC] Support concurrent SBuf::c_str() calls

2016-10-02 Thread Alex Rousskov
On 10/02/2016 03:25 PM, Kinkie wrote:
> On Fri, Sep 30, 2016 at 6:03 PM, Alex Rousskov
>  wrote:
>> On 09/29/2016 09:19 PM, Amos Jeffries wrote:
>>> On 30/09/2016 5:03 a.m., Alex Rousskov wrote:
>>>> Should we remove the increment to make concurrent c_str() calls safe?
>>
>>> The reason it exists remember is to prevent other SBuf sharing that
>>> storage MemBuf from thinking they can append to the storage over top of
>>> the terminator.
>>
>> Yes, I know, but that increment not only prevents problems but _causes_
>> them as well: The increment may force the SBuf to realloc, which may
>> result in deallocation of the original storage still pointed to by the
>> earlier c_str() result.
>>
>>
>>> Given that we have no easy knowledge of how the c_str()
>>> is going to be used by the recipient code we cannot guarantee lack of
>>> terminator is safe.
>>
>> Yes, with or without the increment, Squid may crash or misbehave when
>> buggy code appends to SBuf while a previous c_str() result is still in
>> use. The lack of increment is not safe. The presence of the increment is
>> not safe. The problems are different, but they exist with or without the
>> increment.
>>
>>
>>> Time to push SBuf usage forward and aggressively remove the need for
>>> c_str()?
>>
>> The need for c_str() will never go away because libraries outside of
>> Squid control need c-strings. And, due to the lack of proper QA, any
>> "aggressive push" affecting lots of Squid code is a recipe for a
>> disaster IMO.
>>
>> The question is which c_str() implementation is safer? The one that may
>> crash Squid when called twice for the same SBuf in the same expression
>> or the one that may crash Squid when appending to the same SBuf in the
>> same expression? There are already two known cases of the former. There
>> are currently no known cases of the latter.
>>
>>
>> Overall, I know of three primary ways to implement c_str():
>>
>> 1. Always 0-terminate SBuf-used storage. Safest implementation possible,
>> but has serious negative performance overheads, especially for tokenizing.
> 
> If this is the decision, might as well drop sbuf altogether in favor
> of std::string (or possibly std::string + custom mempools-friendly
> Allocator).

You are probably right, although std::string may have other funny things
going on that we might want to avoid/control.


>> 2. Terminate the used storage at the time of c_str().
>>
>> 2a. Increment the used storage size. Current implementation.
>> Leads to problems discussed in this thread.
>>
>> 2b. Do not increment the used storage size.
>> Leads to other problems discussed in this thread.
>>
>> 3. Allocate and store a new c-string as needed, copying from the
>> storage. Safest implementation possible but has serious negative
>> performance overheads. These overheads affect different use cases than
>> those of #1.

> If we are aware of the underlying issue, there's no reason not to save
> the temporary to a naked char* and use that, as long as the SBuf is
> not touched it'll work fine.

Save to a temporary variable outside SBuf? That is not a solution
because we are trying to address the cases where the developer forgot to
do that or could not really tell that doing that is necessary. Anything
from trivial/direct

  foo(buf.c_str(), buf.c_str());

to complex/hidden

  foo(buf, buf.c_str()) which calls bar(buf1.c_str(), cstr2);



>> Needless to say, we can add clever code to reduce risks associated with
>> some implementations. For example (these are just rough sketches/ideas):
>>
>> a) Appending code in #2b can check whether it is about to overwrite the
>> 0-terminator added by a previous c_str() call and, if yes, preserve the
>> old storage (we can preserve one such storage or maintain a global FIFO
>> queue).
>>
>> b) C-string allocating code in #3 can maintain a global FIFO queue of
>> previously allocated but now supposedly unused c-strings to minimize the
>> risk of de-allocating a still used c-string.
>>
>> c) C-string allocating code in #3 can delay allocation until it might be
>> needed, similar to how (a) decides that the 0-terminator is in danger.
>>
>>
>> Going forward, do you think we should:
>>
>> i. Keep using #2a and rewrite the known double-c_str() code.
>> ii. Make double-c_str() code safe (i.e., switch to #2b).
>> iii. Add code to make the existing #2 implementation safer.
>> iv. Switch to a completely different implementation like #1 or #3.

Re: [squid-dev] [PATCH] convert DNS shutdown to runner

2016-10-02 Thread Alex Rousskov
On 10/02/2016 06:35 AM, Amos Jeffries wrote:
> This patch adds a Runner to manage the DNS component state on shutdown
> (and the matching reconfigure port closures). The actions taken on
> reconfigure and shutdown are not changed, just their timing.

> +RunRegisteredHere(RegisteredRunner::startReconfigure);
> +
>  // Initiate asynchronous closing sequence
>  serverConnectionsClose();
>  icpClosePorts();
>  #if USE_HTCP
>  htcpClosePorts();
>  #endif
> -Dns::Shutdown();


Have you checked that serverConnectionsClose(), icpClosePorts(), and
htcpClosePorts() (and the processing they trigger) can live without DNS?
I am not saying they cannot. I am just asking whether we are just hoping
that they can or reasonably certain that they can. Either way, please
reflect the answer in the commit message.


> Visible differences are now that Squid logs when DNS ports are being
> closed (and the reason)

> +debugs(78, DBG_IMPORTANT, reason << ": Closing DNS sockets");

IMHO, there is nothing important about it. Reconfiguration and shutting
down are already logged. The [changing] details of that long process are
not important [to admins]. I suggest changing the debug level to 2, but
I cannot insist on it -- cache.log is already filled with noise.


> +debugs(78, DBG_IMPORTANT, reason << ": Closing DNS sockets");

Same here.


> +/// Meant for halting any active modules that could be triggered by 
> external
> +/// events (such as listening ports) while changing their configuration.

I suggest something a lot more general, like "Meant for modules that
need to prepare for their configuration being changed [outside their
control]. The changes end with the syncConfig() event."

All of the above can be adjusted during commit IMO.


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [RFC] Support concurrent SBuf::c_str() calls

2016-10-03 Thread Alex Rousskov
On 10/02/2016 11:51 PM, Amos Jeffries wrote:
> On 3/10/2016 1:03 p.m., Alex Rousskov wrote:
>> On 10/02/2016 03:25 PM, Kinkie wrote:
>>> On Fri, Sep 30, 2016 at 6:03 PM, Alex Rousskov
>>>> Overall, I know of three primary ways to implement c_str():
>>>>
>>>> 1. Always 0-terminate SBuf-used storage. Safest implementation possible,
>>>> but has serious negative performance overheads, especially for tokenizing.

>>> If this is the decision, might as well drop sbuf altogether in favor
>>> of std::string (or possibly std::string + custom mempools-friendly
>>> Allocator).

>> You are probably right, although std::string may have other funny things
>> going on that we might want to avoid/control.

> Yes. [...]

I am dropping this part of the discussion until somebody proposes to do
#1. The discussion contains wrong statements IMO but I will ignore them
in hope that they remain irrelevant (because nobody will do #1).



>>>> 2. Terminate the used storage at the time of c_str().
>>>>
>>>> 2a. Increment the used storage size. Current implementation.
>>>> Leads to problems discussed in this thread.
>>>>
>>>> 2b. Do not increment the used storage size.
>>>> Leads to other problems discussed in this thread.
>>>>
>>>> 3. Allocate and store a new c-string as needed, copying from the
>>>> storage. Safest implementation possible but has serious negative
>>>> performance overheads. These overheads affect different use cases than
>>>> those of #1.

>>> If we are aware of the underlying issue, there's no reason not to save
>>> the temporary to a naked char* and use that, as long as the SBuf is
>>> not touched it'll work fine.

> I imagined a cached const char* pointer inside SBuf that c_str()
> produces if it is unset. The cow() operations freeing that cached pointer.

A pointer to what? A pointer to buf() is approach #2. A pointer to
specially allocated storage is approach #3.

>>>> Going forward, do you think we should:
>>>>
>>>> i. Keep using #2a and rewrite the known double-c_str() code.
> 
> That is two options. I think we should fix the double-c_str() code
> regardless of any other changes.

Implementations #1, #2b, and #3 make double-c_str() code perfectly safe.
We should not rewrite it.


> These are definitely performance bugs
> with the 2nd call triggering a reallocation regardless of other bad side
> effects.

Implementations #1, #2b, and #3 do not result in poor double-c_str()
performance. They make double-c_str() performance very close to a single
c_str() performance.


>>>> ii. Make double-c_str() code safe (i.e., switch to #2b).
>>>> iii. Add code to make the existing #2 implementation safer.
>>>> iv. Switch to a completely different implementation like #1 or #3.

> There are very few things we can do that will resolve this in such a short
> timeframe.

Agreed.


> Any choice we make to alter the design will take a long time to
> check every single code path works okay.

Not really -- if the API stays the same or becomes even safer, then the
impossible task of checking every single code path becomes unnecessary.


> Another option is:
> 
> vi. drop c_str() completely and use SBufToCstring() or SBufToString() as
> ways to obtain char*.

How will the proposed SBufToString() differ from the existing "c_str"?
If you are proposing that SBufToString() will allocate memory that the
calling code will have to clean up explicitly, then I am against that
idea because it makes the API worse, especially in the presence of
exceptions.


> Adjusting callers as necessary for those not to leak memory or be too
> bad for performance will be easier than a full code audit.

I do not think any of the other options require a "full code audit"
either so this is not a valid comparison point.

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


[squid-dev] Bug 4527: Missing MemObject::storeId value

2016-10-05 Thread Alex Rousskov
On 10/05/2016 11:17 AM, Amos Jeffries wrote:

> ideally also a fix for the missing storeId() bug that has shown up in
> 3.5 - if there is anything like a solution in the works for that. I
> expect it was the Last-Modified backports.

Do you expect it was v3.5 r14090 (the Last-Modified backport)? All the
reports about Bug 4527 appear to predate that branch revision AFAICT.


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


[squid-dev] [PATCH] Faster SBuf::append

2016-10-06 Thread Alex Rousskov
Hello,

The attached optimization patch was inspired by reviewing the
following code:

> Parser::parse(const SBuf &aBuf)
...
> if (preservedData_.isEmpty())
> preservedData_ = aBuf; // avoid needless memory allocation
> else
> preservedData_.append(aBuf);

Supporting this kind of optimization automatically was a little trickier
than I thought, but I think the attached patch does that. After the
patch, the above code will collapse to a single append call:

  preservedData_.append(aBuf);


HTH,

Alex.
P.S. AFAICT, there is no guarantee that an SBuf using GetStorePrototype()
is empty, so checking both conditions seems necessary to me.
Optimize appending to empty SBufs, a common use case.

Most appends start with an empty buffer. Some append only once. There
is no reason to allocate new memory for that first append or force the
appender to write code to optimize that first append.

Why also check that SBuf has never been configured or altered then? To
preserve pre-allocated SBufs, such as those [ab]used as I/O buffers.

=== modified file 'src/sbuf/SBuf.cc'
--- src/sbuf/SBuf.cc	2016-06-14 16:54:23 +
+++ src/sbuf/SBuf.cc	2016-10-06 15:29:26 +
@@ -170,40 +170,43 @@ SBuf::rawSpace(size_type minSpace)
 void
 SBuf::clear()
 {
 #if 0
 //enabling this code path, the store will be freed and reinitialized
 store_ = GetStorePrototype(); //uncomment to actually free storage upon clear()
 #else
 //enabling this code path, we try to release the store without deallocating it.
 // will be lazily reallocated if needed.
 if (store_->LockCount() == 1)
 store_->clear();
 #endif
 len_ = 0;
 off_ = 0;
 ++stats.clear;
 }
 
 SBuf&
 SBuf::append(const SBuf &S)
 {
+if (isEmpty() && store_ == GetStorePrototype())
+return (*this = S); // optimization: avoid needless copying
+
 const Locker blobKeeper(this, S.buf());
 return lowAppend(S.buf(), S.length());
 }
 
 SBuf &
 SBuf::append(const char * S, size_type Ssize)
 {
 const Locker blobKeeper(this, S);
 if (S == NULL)
 return *this;
 if (Ssize == SBuf::npos)
 Ssize = strlen(S);
 debugs(24, 7, "from c-string to id " << id);
 // coverity[access_dbuff_in_call]
 return lowAppend(S, Ssize);
 }
 
 SBuf &
 SBuf::append(const char c)
 {
___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Faster SBuf::append

2016-10-06 Thread Alex Rousskov
On 10/06/2016 10:57 AM, Amos Jeffries wrote:

> Please add a check to the unit test testSBuf::testAppendSBuf()
> to guarantee that the (*this = S) assignment code path updates the store
> reference count rather than doing a bit-wise copy of the SBuf.

I support that addition but do not have the time to implement it right
now. It would also be nice to add a test case that the optimized
assignment path does _not_ happen for pre-allocated SBufs.


>> P.S. AFAICT, there is no guarantee that SBuf using GetStorePrototype()
>> isEmpty so checking both conditions seems necessary to me.

> That sounds like a bug to me. If the initial prototype is anything but
> empty then the resulting new/clear'ed SBuf might end up with some random
> data prefix. Followup patch?

Not a bug AFAICT: The initial prototype storage might get dirty, but the
freshly initialized SBuf will be _empty_, making any storage bytes
irrelevant.


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


[squid-dev] [PATCH] Fixed v3.5 bare eCAP build

2016-10-14 Thread Alex Rousskov
Hi Amos,

The attached patch is for v3.5. Trunk has the same fix in r14884.

TODO: Adjust automated build tests to test bare eCAP configuration
(i.e., eCAP without ICAP).


HTH,

Alex.
Fixed v3.5 r14082 build with eCAP but without ICAP support.

That is, when ./configured with --enable-ecap --disable-icap-client.

AccessLogEntry::icap requires ICAP_CLIENT, not just USE_ADAPTATION.

=== modified file 'src/format/Format.cc'
--- src/format/Format.cc	2016-09-16 11:53:28 +
+++ src/format/Format.cc	2016-10-15 00:14:10 +
@@ -301,54 +301,54 @@ log_quoted_string(const char *str, char
 ++p;
 *p = *str;
 ++p;
 ++str;
 break;
 }
 }
 
 *p = '\0';
 }
 
 /// XXX: Misnamed. TODO: Split reply;
-#if USE_ADAPTATION
+#if ICAP_CLIENT
 // al->icap.reqMethod is methodNone in access.log context
 if (!msg && al->icap.reqMethod == Adaptation::methodReqmod)
 msg = al->adapted_request;
 #endif
 return msg;
 }
 
 /// XXX: Misnamed. See actualReplyHeader().
 /// \return HttpRequest or HttpReply for %http::>h.
 static const HttpMsg *
 actualRequestHeader(const AccessLogEntry::Pointer &al)
 {
-#if USE_ADAPTATION
+#if ICAP_CLIENT
 // al->icap.reqMethod is methodNone in access.log context
 if (al->icap.reqMethod == Adaptation::methodRespmod) {
 // XXX: for now AccessLogEntry lacks virgin response headers
 return NULL;
 }
 #endif
 return al->request;
 }
 
 void
 Format::Format::assemble(MemBuf &mb, const AccessLogEntry::Pointer &al, int logSequenceNumber) const
 {
 char tmp[1024];
 String sb;
 
 for (Token *fmt = format; fmt != NULL; fmt = fmt->next) {   /* for each token */
 const char *out = NULL;
 int quote = 0;
 long int outint = 0;
 int doint = 0;
@@ -802,65 +802,65 @@ Format::Format::assemble(MemBuf &mb, con
 if (al->adapted_request)
 sb = al->adapted_request->header.getByNameListMember(fmt->data.header.header, fmt->data.header.element, fmt->data.header.separator);
 
 out = sb.termedBuf();
 
 quote = 1;
 
 break;
 
 case LFT_REPLY_HEADER_ELEM: {
 if (const HttpMsg *msg = actualReplyHeader(al))
 sb = msg->header.getByNameListMember(fmt->data.header.header, fmt->data.header.element, fmt->data.header.separator);
 
 out = sb.termedBuf();
 
 quote = 1;
 }
 break;
 
 case LFT_REQUEST_ALL_HEADERS:
-#if USE_ADAPTATION
+#if ICAP_CLIENT
 if (al->icap.reqMethod == Adaptation::methodRespmod) {
 // XXX: since AccessLogEntry::Headers lacks virgin response
 // headers, do nothing for now
 out = NULL;
 } else
 #endif
 {
 out = al->headers.request;
 }
 
 quote = 1;
 
 break;
 
 case LFT_ADAPTED_REQUEST_ALL_HEADERS:
 out = al->headers.adapted_request;
 
 quote = 1;
 
 break;
 
 case LFT_REPLY_ALL_HEADERS:
 out = al->headers.reply;
-#if USE_ADAPTATION
+#if ICAP_CLIENT
 if (!out && al->icap.reqMethod == Adaptation::methodReqmod)
 out = al->headers.adapted_request;
 #endif
 
 quote = 1;
 
 break;
 
 case LFT_USER_NAME:
 #if USE_AUTH
 if (al->request && al->request->auth_user_request != NULL)
 out = strOrNull(al->request->auth_user_request->username());
 #endif
 if (!out && al->request && al->request->extacl_user.size()) {
 if (const char *t = al->request->extacl_user.termedBuf())
 out = t;
 }
 
 if (!out)
 out = strOrNull(al->cache.extuser);

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] bzr -> git?

2016-10-16 Thread Alex Rousskov
On 10/16/2016 06:36 PM, Kinkie wrote:

>   I'm currently trying to use recent advancements in Jenkins to
> improve our QA via gated commits to trunk.
>   This raises (again) the issue of bazaar versus git. Remaining on
> bazaar is getting more and more painful, as tools such as jenkins
> focus on git. As an example, jenkins pipelines only support git.
>  This topic was already touched in the past, and IIRC the output
> was "we'll do it, but not now". Can we reevaluate when "now" might be?

IMHO, if the git migration is done right, it can be done now.

However, doing it right most likely requires a lot more than posting a
couple of mailing list messages and running a conversion script. AFAIK,
nobody who can lead this has the cycles to do it now. This is one of the
reasons I have proposed paying a QA Engineer to do it well (it could
become a part of Pilot-1 that you are reviewing on board@).


Cheers,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Support tunneling of bumped non-HTTP traffic. Other SslBump fixes.

2016-10-17 Thread Alex Rousskov
On 10/17/2016 01:57 AM, Christos Tsantilas wrote:
> On 10/14/2016 02:30 PM, Marcus Kool wrote:
>> Squid sends the following line to the URL rewriter:
>> (unknown)://173.194.76.188:443 / - NONE

> Squid internally generates a request to serve the non-HTTP client request,
> and this is what you are seeing as "(unknown)://173.194.76.188:443".

How about sending a CONNECT-like "173.194.76.188:443" URI instead of a
malformed one? That is, using option #3 below:

1. Current syntactically malformed URI: (unknown)://host:port

2. Lying about the protocol/scheme: http://host:port/

3. Authority form URI, as in HTTP CONNECT: host:port

4. Using made-up URI scheme: tcp://host:port/
   See http://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml

HTH,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Support tunneling of bumped non-HTTP traffic. Other SslBump fixes.

2016-10-17 Thread Alex Rousskov
On 10/17/2016 10:56 PM, Amos Jeffries wrote:
> On 18/10/2016 7:54 a.m., Christos Tsantilas wrote:
>> On 10/17/2016 05:42 PM, Alex Rousskov wrote:
>>> On 10/17/2016 01:57 AM, Christos Tsantilas wrote:
>>>> On 10/14/2016 02:30 PM, Marcus Kool wrote:
>>>>> Squid sends the following line to the URL rewriter:
>>>>> (unknown)://173.194.76.188:443 / - NONE
>>>
>>>> Squid internally generates a request to serve the non-HTTP client request,
>>>> and this is what you are seeing as "(unknown)://173.194.76.188:443".
>>>
>>> How about sending a CONNECT-like "173.194.76.188:443" URI instead of a
>>> malformed one? That is, using option #3 below:
>>>
>>> 1. Current syntactically malformed URI: (unknown)://host:port
>>>
>>> 2. Lying about the protocol/scheme: http://host:port/
>>>
>>> 3. Authority form URI, as in HTTP CONNECT: host:port
>>>
>>> 4. Using made-up URI scheme: tcp://host:port/
>>>See http://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml
>>
>>
>> We can use one of the 3 or 4. We can just define a new proto for this
>> case, e.g., a PROTO_TCP or PROTO_TUNNEL, and define a Uri::Scheme for this.
>>
>> Personally I like the tcp://host:port. But I do not actually have a
>> strong opinion. It looks like we should also take into account squid
>> helpers (and maybe other external tools like log analysers).
>>
>> Opinions?
> 
> 
> If TLS and provides ALPN stating unsupported protocol, use that as the
> UriScheme image with PROTOCOL_OTHER,
> 
> else if Tunnel-Protocol from clients CONNECT, same using that value ...,
> 
> Otherwise the CONNECT with authority-URI Alex mentioned is correct for
> both helpers and logs.
> 
> Basically, the most accurate true information - no lies or guesses.

I agree in principle, but recommend doing just #3 for now, in this
patch: Everything else mentioned by Amos sounds good, and we can
probably add more scheme/protocol information sources as well, but
accurate extraction and reporting of unsupported protocol schemes is a
separate (and clearly non-trivial!) issue.

Let's fix the much bigger known problems first, which is what the
proposed patch attempts to do. I hope that option #3 allows us to do
that without causing a regression that Marcus have identified and
without creating extra work compared to the final state where more
schemes may be reported as outlined by Amos (and option #4).


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] Squid 3.5.23: crash in Comm::DoSelect

2016-10-18 Thread Alex Rousskov
On 10/18/2016 03:44 AM, oleg gv wrote:

> nfds=284, so loop ends on 283 and pfds[283] is buggy

> I/o module is  src/comm/ModPoll.cc, method Comm::DoSelect(int msec)
> On stack we see that pfds[SQUID_MAXFD=256], so is less than nfds in loop.
> May be malloc nfds?

If your maxfd is bigger than SQUID_MAXFD then the bug is elsewhere and
dynamically allocating pfds is not the right fix (even though it will
"work").

I suspect your Squid is creating or accepting a descriptor that
exceeds SQUID_MAXFD-1. Biggest_FD+1 cannot be allowed to exceed the
misnamed SQUID_MAXFD limit.

This combination looks like a big red flag to me:

struct pollfd pfds[SQUID_MAXFD];
...
maxfd = Biggest_FD + 1;
for (int i = 0; i < maxfd; ++i) {
...
pfds[nfds].fd = i;

That code is missing assert(maxfd <= SQUID_MAXFD) which will fail in
your case.

If you want a workaround, try building Squid with a reasonable number of
maximum descriptors (e.g., 16K, 32K, or 64K). If that number is never
reached in your environment, the code will appear to work.

If you want to try a quick fix, replace SQUID_MAXFD with (Biggest_FD +
1) when declaring pfds. You may need to ignore/disable compiler warnings
about C++ arrays with dynamic sizes. Alternatively, you can allocate
pfds dynamically (as you suggested).
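
For example, a sketch of the dynamic-allocation workaround (using <vector>):

  std::vector<struct pollfd> pfds(Biggest_FD + 1); // sized per I/O cycle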

If you want to fix the bug, audit all Biggest_FD- and
SQUID_MAXFD-related code to make sure the two are always in sync.


HTH,

Alex.



> 2016-10-18 8:29 GMT+03:00 Amos Jeffries:
> 
> FYI: Squid-3.5.23 does not exist yet. What is the output of "squid -v" ?
> 
> On 18/10/2016 5:01 a.m., oleg gv wrote:
> > I have big traffic (at least 100 computers) , and squid often crashed in
> > Comm::DoSelect(int msec) function.
> > I have interception mode and NAT redirect.
> >
> > In coredump I saw then bug is in next fragment of code:
> >
> > 446│ for (size_t loopIndex = 0; loopIndex < nfds; ++loopIndex) {
> > 447│ fde *F;
> > 448│ int revents = pfds[loopIndex].revents;
> > 449│ fd = pfds[loopIndex].fd;
> > 450│
> > 451│ if (fd == -1)
> > 452│ continue;
> > 453│
> > 454├>if (fd_table[fd].flags.read_pending)
> > 455│ revents |= POLLIN;
> >
> > SIGSEGV occurred often (about 1 time in a minute) in line 454 : 
> fd=-66012128
> > , loopindex=283
> >
> > (gdb) p pfds[282]
> > $17 = {fd = 291, events = 64, revents = 0}   -- looks ok
> >
> > (gdb) p pfds[283]
> > $18 = {fd = -66012128, events = 32595, revents = 0}  -- looks strange 
> and
> > spoiled
> >
> > (gdb) p Biggest_FD
> > $19 = 292
> >
> 
> What is the nfds value ?
> 
> It looks to me like only 282 FD have operations to perform on this I/O
> cycle.
> 
> What I/O module is being used?
> 
>  src/comm/ModDevPoll.cc:Comm::DoSelect(int msec)
>  src/comm/ModPoll.cc:Comm::DoSelect(int msec)
> 
> 
> Amos
> 
> ___
> squid-dev mailing list
> squid-dev@lists.squid-cache.org 
> http://lists.squid-cache.org/listinfo/squid-dev
> 
> 
> 
> 
> 
> ___
> squid-dev mailing list
> squid-dev@lists.squid-cache.org
> http://lists.squid-cache.org/listinfo/squid-dev
> 

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] Template methods inside normal classes

2016-10-18 Thread Alex Rousskov
On 10/18/2016 09:52 AM, Christos Tsantilas wrote:

> Is it valid to use template methods inside normal classes for squid?

Yes (until we learn that some compilers choke on them). IIRC, I even had
some patches that use them but perhaps they have not been posted or
committed yet. Needless to say, their use must be justified (but that is
true for virtually any code).


> I know they are working, I am just asking if it is acceptable by squid
> policy.

Everything useful that is not prohibited is allowed :-).

When the patch with member templates is posted for review, the reviewers
will evaluate your decision to use them. There is no need to secure
their acceptance in advance by hand-waving about hypothetical needs IMO.


HTH,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Support tunneling of bumped non-HTTP traffic. Other SslBump fixes.

2016-10-19 Thread Alex Rousskov
On 10/19/2016 08:49 AM, Christos Tsantilas wrote:
> I am attaching a new patch.

I would like to discuss two issues:


* Logging of scheme-less URLs

> This defines a new proto, PROTO_TCP, and for it prints the URL
> in the form host:port.

The PROTO_TCP name sounds bad because we may want to log
tcp://host:port/ URLs in the future as discussed earlier.

PROTO_NONE and PROTO_UNKNOWN are already utilized for different _valid_
use cases. I suggest adding PROTO_AUTHORITY or PROTO_AUTHORITY_FORM
instead of adding PROTO_TCP.


* flags.tunneling

AFAIK, you added flags.tunneling specifically to work around the
tunneling request removal from the ConnStateData pipeline (trunk commit
r14418 "Cleanup: Refactor ConnStateData pipeline handling"). The
justification for that removal is given in the corresponding commit
message (thank you, Amos!):

> Initial testing revealed CONNECT tunnels always being logged as ABORTED.
> [..] I have now made the context be finished() just prior to the
> TunnelStateData being destroyed. That way normal closure should show up
> only as TUNNEL, but timeouts and I/O errors should still be recorded as
> abnormal.

That explanation makes sense to me -- when we know that the request has
finished successfully, we should call its finished() method.

AFAICT, the flags.tunneling is currently needed specifically because the
following ConnStateData::checkLogging() condition is [sometimes] false
when the pipeline is empty and we are done tunneling:

> // do not log connections that closed after a transaction (it is normal)
> if (receivedFirstByte_ && inBuf.isEmpty())
> return;

Instead of adding flags.tunneling, can we fix/adjust that condition
and/or the code that affects its components to make the condition true?

receivedFirstByte_: This member remains false if we splice and tunnel
intercepted connections during step1 because we read no bytes at that
time, right? However, it feels like this part of the condition should be
not receivedFirstByte_ but something like thereWasAtLeastOne (i.e., we
pushed at least one request into ConnStateData pipeline). Can we adjust
the pipeline code to remember that at least one request has been queued
and use that instead of receivedFirstByte_ here?

inBuf.isEmpty(): That part of the condition looks correct to me. If it
is false in the tunneling case being discussed, then we need to fix the
code to clean inBuf when tunneling starts.

In summary, can we do something like this instead:

// do not log connections that closed after a transaction (it is normal)
if (pipeline.used() && inBuf.isEmpty())
return;


Thank you,

Alex.


>  const SBuf &
>  HttpRequest::effectiveRequestUri() const
>  {
> -if (method.id() == Http::METHOD_CONNECT)
> +if (method.id() == Http::METHOD_CONNECT || url.getScheme() == 
> AnyP::PROTO_TCP)
>  return url.authority(true); // host:port
>  return url.absolute();
>  }



___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] Squid 3.5.23: crash in Comm::DoSelect

2016-10-20 Thread Alex Rousskov
On 10/19/2016 04:59 AM, oleg gv wrote:
> so, it looks like an architecture bug that we can't fix, sorry.

No problem. Fixing this bug is far from trivial so your decision to
leave the bug in place for others to fix is certainly understandable.


> but in our case we will try to sync all values to 1024, hope it helps to
> fix bug in squid.

Yes, keeping all limits in sync may work around the bug.

Alex.


> 2016-10-18 17:48 GMT+03:00 Alex Rousskov:
> 
> On 10/18/2016 03:44 AM, oleg gv wrote:
> 
> > nfds=284, so loop ends on 283 and pfds[283] is buggy
> 
> The I/O module is src/comm/ModPoll.cc, method Comm::DoSelect(int msec).
> On the stack we see pfds[SQUID_MAXFD=256], so the array is smaller than
> nfds in the loop. Maybe malloc pfds with nfds entries?
> 
> If your maxfd is bigger than SQUID_MAXFD then the bug is elsewhere and
> dynamically allocating pfds is not the right fix (even though it will
> "work").
> 
> I suspect your Squid is creating or accepting a descriptor that
> exceeds SQUID_MAXFD-1. Biggest_FD+1 cannot be allowed to exceed the
> misnamed SQUID_MAXFD limit.
> 
> This combination looks like a big red flag to me:
> 
> struct pollfd pfds[SQUID_MAXFD];
> ...
> maxfd = Biggest_FD + 1;
> for (int i = 0; i < maxfd; ++i) {
> ...
> pfds[nfds].fd = i;
> 
> That code is missing assert(maxfd <= SQUID_MAXFD) which will fail in
> your case.
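> 
> For illustration, the missing check might look like this (a sketch;
> surrounding ModPoll.cc code omitted):
> 
> maxfd = Biggest_FD + 1;
> assert(maxfd <= SQUID_MAXFD); // fail loudly instead of overrunning pfds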
> 
> If you want a workaround, try building Squid with a reasonably large
> maximum number of descriptors (e.g., 16K, 32K, or 64K). If that number
> is never reached in your environment, the code will appear to work.
> 
> If you want to try a quick fix, replace SQUID_MAXFD with (Biggest_FD +
> 1) when declaring pfds. You may need to ignore/disable compiler warnings
> about C++ arrays with dynamic sizes. Alternatively, you can allocate
> pfds dynamically (as you suggested).
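> 
> A minimal sketch of that dynamic allocation (assuming C++11; the rest
> of the loop would stay unchanged):
> 
> // requires <vector>; sizes the array at run time instead of
> // relying on the compile-time SQUID_MAXFD limit
> std::vector<struct pollfd> pfds(Biggest_FD + 1);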
> 
> If you want to fix the bug, audit all Biggest_FD- and
> SQUID_MAXFD-related code to make sure the two are always in sync.
> 
> 
> HTH,
> 
> Alex.
> 
> 
> 
> > 2016-10-18 8:29 GMT+03:00 Amos Jeffries:
> >
> > FYI: Squid-3.5.23 does not exist yet. What is the output of
> > "squid -v"?
> >
> > On 18/10/2016 5:01 a.m., oleg gv wrote:
> > > I have big traffic (at least 100 computers), and squid often
> > > crashed in the Comm::DoSelect(int msec) function.
> > > I have interception mode and NAT redirect.
> > >
> > > In the coredump I saw that the bug is in this fragment of code:
> > >
> > > 446│ for (size_t loopIndex = 0; loopIndex < nfds; ++loopIndex) {
> > > 447│ fde *F;
> > > 448│ int revents = pfds[loopIndex].revents;
> > > 449│ fd = pfds[loopIndex].fd;
> > > 450│
> > > 451│ if (fd == -1)
> > > 452│ continue;
> > > 453│
> > > 454├>if (fd_table[fd].flags.read_pending)
> > > 455│ revents |= POLLIN;
> > >
> > > SIGSEGV occurred often (about once a minute) at line 454:
> > > fd=-66012128, loopIndex=283
> > >
> > > (gdb) p pfds[282]
> > > $17 = {fd = 291, events = 64, revents = 0}   -- looks ok
> > >
> > > (gdb) p pfds[283]
> > > $18 = {fd = -66012128, events = 32595, revents = 0}  -- looks
> > > strange and spoiled
> > >
> > > (gdb) p Biggest_FD
> > > $19 = 292
> > >
> >
> > What is the nfds value?
> >
> > It looks to me like only 282 FDs have operations to perform on this
> > I/O cycle.
> >
> > What I/O module is being used?
> >
> >  src/comm/ModDevPoll.cc:Comm::DoSelect(int msec)
> >  src/comm/ModPoll.cc:Comm::DoSelect(int msec)
> >
> >
> > Amos
> >

Re: [squid-dev] [PATCH] Refactor wordlist to SBufList in acl/RegexData

2016-10-26 Thread Alex Rousskov
On 10/26/2016 05:18 PM, Kinkie wrote:
>>> the attached patch refactors the use of wordlist to SBufList in
>>> acl/RegexData.cc

> -while (wl != NULL) {
> +for (SBuf i : sl) {

If possible, please avoid creating new SBufs by declaring "i" to be a
constant reference to SBuf. It is probably possible unless you modify
"i" -- I cannot tell for sure by looking at all the "i"s in the patch.

Renaming "i" to something longer and more informative like
"patternOrOption" or even "slowPatternOrOption" would be useful for
those who try to understand what is going on and/or have to find that
variable in a text full of other "i"s.


> +for (auto i : sl) {

Same here, on all counts. Auto does not automatically add "&" or "const".
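
For illustration, the suggested form (with SBufList being a list of
SBuf, the const reference avoids one SBuf copy per iteration; the
variable name is just the suggestion above):

for (const auto &patternOrOption : sl) {
    // use patternOrOption without copying the underlying SBuf
}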


What is the time difference in configuring, for example, one million
regexes with the patched and unpatched code?


I have not reviewed the patch closely. I have no objections.


Thank you,

Alex.



Re: [squid-dev] [RFC] support Cache-Control:immutable

2016-10-26 Thread Alex Rousskov
On 10/26/2016 05:45 PM, Amos Jeffries wrote:
> This new cache control extension being proposed by Mozilla looks like it
> will be quite useful to us as well as browsers.
> 
> I would like to jump in on this early and make Squid-4 one of the first
> implementations once the middleware caching details are a little bit
> more clear. I've already begun discussions to that end in the IETF WG.
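
(For reference, the extension adds an "immutable" flag to Cache-Control
in responses, e.g.:

Cache-Control: max-age=31536000, immutable

telling clients and caches that the body will not change while fresh,
so revalidation before expiry -- such as on a user reload -- can be
skipped.)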

I am puzzled why you request comments on this issue. Are there any
hidden negative side effects of supporting this extension? Some
trade-offs that we should be discussing/considering?


Thank you,

Alex.


