Re: [squid-dev] [PATCH] Some failed transactions are not logged

2016-07-18 Thread Amos Jeffries
On 19/07/2016 2:52 a.m., Alex Rousskov wrote:
> On 07/17/2016 03:34 AM, Amos Jeffries wrote:
>> On 16/07/2016 2:40 a.m., Eduard Bagdasaryan wrote:
>>> +// do not log connections that sent us no bytes (TODO: make 
>>> configurable)
>>> +// do not log connections that closed after a transaction (those are 
>>> normal)
>>> +// XXX: We assume that all inBuf consumption cases were covered above.
>>> +if (inBuf.isEmpty())
>>> +return;
> 
>>  - the if (inBuf.isEmpty()) check is not right for "connections that
>> sent us no bytes". 
> 
> Connections that sent us no bytes will have an empty inBuf and will not
> be logged, just like the comment promises. The fact that some _other_
> connections may also have an empty inBuf (and will not be logged) is
> already implied by the second "do not log" comment. I agree that the two
> comments do not cover all cases and so at least one more comment (or
> code restructuring to cover all known cases) is needed, but the existing
> comments appear to be correct.
> 
> 
>> Squid can reach that test on an idle connection
>> between HTTP requests (quite common situation). 
> 
> ... which is already documented by the second "do not log" comment. The
> grouped comments ought to be interpreted together, of course.
> 

It is. But also results in the TODO about implementing logging those
transactions which have not yet reeived reqeust headers / "first byte".
Resolving that TODO is easy and what I'm aiming at here.

> 
>> That case also proves the XXX assumption is wrong.
> 
> The unsubstantiated (hence XXX) assumption is not about sent-nothing
> (after start or after the last request) cases. It is difficult to
> formulate it correctly without writing a whole paragraph, but we are
> trying to say that if something goes into inBuf, then it can only be
> consumed by cases covered by the two if-statements above the XXX. We
> should reword or move that XXX to clarify.
> 
> Perhaps the following sketch that moves XXX up and details it a little
> would work better?
> 
>>  // if we are parsing request body, its request is responsible for logging
>> if (bodyPipe)
>>   return;
>>
>> // a request currently using this connection is responsible for logging
>> if (!pipeline.empty() && pipeline.back()->mayUseConnection())
>>   return;
>>
>> /* Either we are waiting for the very first transaction, or
>>  * we are done with the Nth transaction and are waiting for N+1st.
>>  * XXX: We assume that if anything was added to inBuf, then it could
>>  * only be consumed by actions already covered by the above checks.
>>  */
>>
>> // do not log connections that sent us no bytes (TODO: make configurable)
>> // do not log connections that closed after a transaction (those are normal)
>> // XXX: TLS bytes are consumed without going through inBuf
>> // XXX: PROXY protocol bytes are consumed without going through inBuf
>> if (inBuf.isEmpty())
>>   return;
> 

That describes the situation better. +1.

> 
> We agree that sent-nothing connections should be logged. I am not
> against logging them by default (unless others have evidence to prove
> that it would create too much noise). However, logging sent-nothing
> connections probably requires a configuration option and can be
> considered outside this patch scope -- the patch is improving access
> logging without claiming to fix all access logging bugs.
> 
> If you are sure that logging all sent-nothing connections *without an
> option to disable that behavior* is the right thing to do in v4, then
> let's ask Eduard to cover that case using receivedFirstByte().
> 

Logging them by default is something I think we should do.

As for config controlling them, that was what my questions about the
access_log ACLs abilities were for. Since Eduard has confirmed that
approach is useless, we will need a new config option added for this new
function/method to disable those log entries.


> 
>>   + which still leaves not-logging at least the cases where TLS bytes or
>> such layered things have occured but no HTTP bytes yet (and the inBuf
>> empty right now). That needs an explicit mention, and deserves an XXX or
>> TODO (different than the "assumes ..." one.)
> 
> I agree that TLS should be either handled or explicitly excluded with an
> XXX comment. Another potentially relevant case to handle or explicitly
> exclude is the PROXY protocol, right? I added two XXXs to cover those in
> the above sketch.
> 

Yes, PROXY is a case to note.

Is ftp_port traffic another one?

I think thats all the current ways for a TCP client connection to
happen. I'm assuming UDP ones are out of scope here.

Thank you
Amos

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


[squid-dev] Jenkins build is back to normal : trunk-full-matrix » clang,j-fbsd-93 #195

2016-07-18 Thread noc
See 


___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] Broken trunk after r14735, r14726

2016-07-18 Thread Alex Rousskov
On 07/18/2016 05:12 AM, Christos Tsantilas wrote:

> I must say that I am worrying a lot for all of these changes.
> It is very difficult for me to follow them, and already I have
> difficulties to read and debug squid openSSL relate code.
> 
> We are using our own naming scheme for openSSL structures, eg
> "Security::SessionPtr" instead of "SSL *" or
> "Security::SessionStatePointer" instead of "SSL_SESSION *" and this is
> make it very difficult to follow Squid/openSSL code.
> 
> It is difficult to read an example openSSL code and trying convert it to
> squid, or reading a reference implementation and trying check against
> squid code. Someone is obligated to search for definitions and
> equivalents types all the time.

> I am just expressing my worries here, I do not want to impose anything
> and if there was a related discussion in squid-dev, sorry that I do not
> participate and I did not express my concerns earlier.


Thank you for courageously documenting your worries and concerns. I
doubt you have missed any critical discussions -- many TLS changes are
committed without review (at best). Unfortunately, I do not see a
practical solution to this problem because

1. There are not enough hours in the day to review everything that
should be reviewed. We do not have enough developers, but the developers
we do have produce enough code to overwhelm even fewer available reviewers.

2. What should be reviewed is itself difficult to identify because many
critical changes (like naming things and changing class hierarchies) are
often hidden behind innocent "cleanup" or "no logic changes"
descriptions that may not even be posted for review.

3. A large portion of reviews are a waste of time because the reviewer
does not understand the problem being solved, the developer does not
understand the problem being solved, and/or the developer is refusing to
fix anything important unless it crashes Squid. And when an agreement is
miraculously reached, the ridiculous amount of time it took to reach
that agreement often negates the benefits of the fixed code.

These problems are not limited to TLS, of course.


Sometimes I wonder whether suspending the whole review process would
result in less pain and more working code overall, despite the fact that
some reviews do result in serious bugs fixed before deployment. However,
suspending all reviews may just open the floodgate of conflicting
designs and crappy code that does not even reach the review stage now
(and so we do not know about it).


If you have specific ideas on how to fix this, please share them, but I
suspect the pain will continue for the foreseeable future because the
problem does not have a solution given the available tools and parameters.

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] Broken trunk after r14735, r14726

2016-07-18 Thread Christos Tsantilas

On 07/18/2016 08:32 PM, Alex Rousskov wrote:

On 07/18/2016 08:49 AM, Christos Tsantilas wrote:

On 07/18/2016 02:12 PM, Christos Tsantilas wrote:

On 07/16/2016 03:56 PM, Amos Jeffries wrote:

On 16/07/2016 7:02 a.m., Alex Rousskov wrote:

* After r14726 (GnuTLS: support for TLS session resume): Squid
segfaults
when attempting to connect to a Secure ICAP service. Official Squid
v4.0.12 suffers from this bug.



It is a strange crash. Is it a corrupted SSL object?


It is an already freed object (its references member is 0).



The patch uses the following line:
  Security::GetSessionResumeData(Security::SessionPointer(ssl),
icapService->sslSession);


The Security::SessionPointer, is a LockingPointer which in trunk-r14726,
does not increase the references of the attached "SSL" object in
constructor.
So the SSL will be released after the Security::SessionPointer is
destroyed, immediately after the above line executed.


... which would explain the zero "references" value. This raises a
question -- did the old code work because the old LockingPointer
constructor was locking? If the answer is yes, then the
invisible-in-callers change from locking to non-locking was a mistake.


The old LockingPointer constructor was not locking.
This was because most(?) of openSSL functions return pre-locked objects

But looks that the best to avoid the confusion is to allow only methods 
resetAndLock and resetWithoutLock




I can only repeat my earlier suggestions to hide that dangerous
constructor behind an explicit static method like
LockingPointer::NewWithoutLocking() and to assert that
resetWithoutLocking() method is fed a previously locked object.


something like that.






Thank you,

Alex.


___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [RFC] "Splicing" bumped requests to resolve\workaround WebSockets issues.

2016-07-18 Thread Alex Rousskov
On 07/17/2016 02:34 PM, Eliezer Croitoru wrote:
> I remember something's vaguely and this is why I didn't quote anything.
> I tried searching for something in the squid-dev list or irc but I couldn't
> found it.

For the future, I hope you will document your vague memories without
saying that somebody else did not present enough details.


On 07/18/2016 12:13 AM, Amos Jeffries wrote:
> that would be nice to have. But is not one of the things holding
> Squid-4 in beta.

Agreed on both counts.

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] Broken trunk after r14735, r14726

2016-07-18 Thread Alex Rousskov
On 07/18/2016 08:49 AM, Christos Tsantilas wrote:
> On 07/18/2016 02:12 PM, Christos Tsantilas wrote:
>> On 07/16/2016 03:56 PM, Amos Jeffries wrote:
>>> On 16/07/2016 7:02 a.m., Alex Rousskov wrote:
 * After r14726 (GnuTLS: support for TLS session resume): Squid
 segfaults
 when attempting to connect to a Secure ICAP service. Official Squid
 v4.0.12 suffers from this bug.

>> It is a strange crash. Is it a corrupted SSL object?

It is an already freed object (its references member is 0).


> The patch uses the following line:
>  Security::GetSessionResumeData(Security::SessionPointer(ssl),
> icapService->sslSession);
> 
> 
> The Security::SessionPointer, is a LockingPointer which in trunk-r14726,
> does not increase the references of the attached "SSL" object in
> constructor.
> So the SSL will be released after the Security::SessionPointer is
> destroyed, immediately after the above line executed.

... which would explain the zero "references" value. This raises a
question -- did the old code work because the old LockingPointer
constructor was locking? If the answer is yes, then the
invisible-in-callers change from locking to non-locking was a mistake.

I can only repeat my earlier suggestions to hide that dangerous
constructor behind an explicit static method like
LockingPointer::NewWithoutLocking() and to assert that
resetWithoutLocking() method is fed a previously locked object.


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Some failed transactions are not logged

2016-07-18 Thread Alex Rousskov
On 07/17/2016 03:34 AM, Amos Jeffries wrote:
> On 16/07/2016 2:40 a.m., Eduard Bagdasaryan wrote:
>> +// do not log connections that sent us no bytes (TODO: make 
>> configurable)
>> +// do not log connections that closed after a transaction (those are 
>> normal)
>> +// XXX: We assume that all inBuf consumption cases were covered above.
>> +if (inBuf.isEmpty())
>> +return;

>  - the if (inBuf.isEmpty()) check is not right for "connections that
> sent us no bytes". 

Connections that sent us no bytes will have an empty inBuf and will not
be logged, just like the comment promises. The fact that some _other_
connections may also have an empty inBuf (and will not be logged) is
already implied by the second "do not log" comment. I agree that the two
comments do not cover all cases and so at least one more comment (or
code restructuring to cover all known cases) is needed, but the existing
comments appear to be correct.


> Squid can reach that test on an idle connection
> between HTTP requests (quite common situation). 

... which is already documented by the second "do not log" comment. The
grouped comments ought to be interpreted together, of course.


> That case also proves the XXX assumption is wrong.

The unsubstantiated (hence XXX) assumption is not about sent-nothing
(after start or after the last request) cases. It is difficult to
formulate it correctly without writing a whole paragraph, but we are
trying to say that if something goes into inBuf, then it can only be
consumed by cases covered by the two if-statements above the XXX. We
should reword or move that XXX to clarify.

Perhaps the following sketch that moves XXX up and details it a little
would work better?

>  // if we are parsing request body, its request is responsible for logging
> if (bodyPipe)
>   return;
> 
> // a request currently using this connection is responsible for logging
> if (!pipeline.empty() && pipeline.back()->mayUseConnection())
>   return;
> 
> /* Either we are waiting for the very first transaction, or
>  * we are done with the Nth transaction and are waiting for N+1st.
>  * XXX: We assume that if anything was added to inBuf, then it could
>  * only be consumed by actions already covered by the above checks.
>  */
> 
> // do not log connections that sent us no bytes (TODO: make configurable)
> // do not log connections that closed after a transaction (those are normal)
> // XXX: TLS bytes are consumed without going through inBuf
> // XXX: PROXY protocol bytes are consumed without going through inBuf
> if (inBuf.isEmpty())
>   return;


We agree that sent-nothing connections should be logged. I am not
against logging them by default (unless others have evidence to prove
that it would create too much noise). However, logging sent-nothing
connections probably requires a configuration option and can be
considered outside this patch scope -- the patch is improving access
logging without claiming to fix all access logging bugs.

If you are sure that logging all sent-nothing connections *without an
option to disable that behavior* is the right thing to do in v4, then
let's ask Eduard to cover that case using receivedFirstByte().


>   + which still leaves not-logging at least the cases where TLS bytes or
> such layered things have occured but no HTTP bytes yet (and the inBuf
> empty right now). That needs an explicit mention, and deserves an XXX or
> TODO (different than the "assumes ..." one.)

I agree that TLS should be either handled or explicitly excluded with an
XXX comment. Another potentially relevant case to handle or explicitly
exclude is the PROXY protocol, right? I added two XXXs to cover those in
the above sketch.


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] Broken trunk after r14735, r14726

2016-07-18 Thread Christos Tsantilas

On 07/18/2016 02:12 PM, Christos Tsantilas wrote:

On 07/16/2016 03:56 PM, Amos Jeffries wrote:

On 16/07/2016 7:02 a.m., Alex Rousskov wrote:

* After r14726 (GnuTLS: support for TLS session resume): Squid segfaults
when attempting to connect to a Secure ICAP service. Official Squid
v4.0.12 suffers from this bug.


It is a strange crash. Is it a corrupted SSL object?


The patch uses the following line:
 Security::GetSessionResumeData(Security::SessionPointer(ssl), 
icapService->sslSession);



The Security::SessionPointer, is a LockingPointer which in trunk-r14726, 
does not increase the references of the attached "SSL" object in 
constructor.
So the SSL will be released after the Security::SessionPointer is 
destroyed, immediately after the above line executed.










___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


[squid-dev] Jenkins build is back to normal : trunk-matrix » clang,rs-fbsd-10 #694

2016-07-18 Thread noc
See 


___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Some failed transactions are not logged

2016-07-18 Thread Eduard Bagdasaryan

Hello Amos,

2016-07-17 12:34 GMT+03:00 Amos Jeffries :
>  can these URI be logged with url_regex ACLs on access_log lines?

No because ACLUrlStrategy::requiresRequest() returns true.

> why is pipeline.back() being checked instead of pipeline.front() ?
> [not saying its wrong yet, just asking for a good reason.]

Since the pipeline acts as a FIFO we need the last-added pipeline
element to check whether it uses connection. I.e., if we pipelined
several requests, the CONNECT request (if we got it) will be the last,
because it starts using connection.

> logAcceptError() should use a distinct "error:" URI labeling this as a
> TCP accept error, since it has nothing to do with HTTP state.
Something like "error:user-connection-establishment" or
"error:open-user-connection"?  What do you think?


Eduard.
___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] Broken trunk after r14735, r14726

2016-07-18 Thread Christos Tsantilas

On 07/16/2016 03:56 PM, Amos Jeffries wrote:

On 16/07/2016 7:02 a.m., Alex Rousskov wrote:

Hello,

 There are two more recent changes that broke trunk:

* After r14735 (Replaced TidyPointer with std::unique_ptr), Squid cannot
start due to an "std::bad_function_call" exception.

* After r14726 (GnuTLS: support for TLS session resume): Squid segfaults
when attempting to connect to a Secure ICAP service. Official Squid
v4.0.12 suffers from this bug.

Stack traces from both crashes are quoted at the end of this email.

Please fix these regressions or undo the changes that created or exposed
them.





* segfault when attempting to connect to a Secure ICAP REQMOD service
(tested with r14726, r14734):



Does this patch fix the session issue ?

=== modified file 'src/security/Session.cc'
--- src/security/Session.cc 2016-07-07 19:03:02 +
+++ src/security/Session.cc 2016-07-16 12:43:38 +
@@ -53,7 +53,7 @@
  void
  Security::SetSessionResumeData(const Security::SessionPtr , const
Security::SessionStatePointer )
  {
-if (s) {
+if (data) {
  #if USE_OPENSSL
  (void)SSL_set_session(s, data.get());
  #elif USE_GNUTLS


I'm a little worried about the code calling SetSessionResumeData.
OpenSSL documentation states:
   "If there is already a session set inside ssl (because it was set with
SSL_set_session() before or because the same ssl was already used for a
connection), SSL_SESSION_free() will be called for that session."

But our SetSessionResumeData() is called after setting up the sessions
host data, etc. So I'm thinking all that setup in
Ssl::BlindPeerConnector::initializeTls() may be thrown away by the
resume action being called.



Squid crashes at the first TLS connection trying to establish to the 
ICAP server.  There is not any session to resume so the SSL session 
related methods should not called at all.


It is a strange crash. Is it a corrupted SSL object?

I must say that I am worrying a lot for all of these changes.
It is very difficult for me to follow them, and already I have 
difficulties to read and debug squid openSSL relate code.


We are using our own naming scheme for openSSL structures, eg 
"Security::SessionPtr" instead of "SSL *" or 
"Security::SessionStatePointer" instead of "SSL_SESSION *" and this is 
make it very difficult to follow Squid/openSSL code.


It is difficult to read an example openSSL code and trying convert it to 
squid, or reading a reference implementation and trying check against 
squid code. Someone is obligated to search for definitions and 
equivalents types all the time.


I have not ever study for a good way to support both GnuTLS and openSSL 
for squid, to know the problems,  but probably I would start to 
implement openSSLAcceptor/gnuTlsAcceptor and 
openSSLPeerConnector/gnuTLSPeerConnector classes.
Internally gnuTLS and openSSL should use their own types. This is will 
help current and future squid developers.


I am just expressing my worries here, I do not want to impose anything 
and if there was a related discussion in squid-dev, sorry that I do not 
participate and I did not express my concerns earlier.


Regards,
   Christos



___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [RFC] reduce MISS on transients collision

2016-07-18 Thread Amos Jeffries
On 18/07/2016 6:59 a.m., Alex Rousskov wrote:
> On 07/17/2016 05:01 AM, Amos Jeffries wrote:
>> I've just been looking at the Store::Controller::find() implementation
>> and it struck me that if the transients lookup has an error the object
>> will fail to HIT on any existing cache entries.
> 
> If the transients table tells us that the transient object is in a "bad"
> state, then trying to load that same object from a store will fail at
> best or result in a stale/truncated/stuck response at worse.
> 
> 
>> Alex; am I missing something undocumented here ?
> 
> You might be missing something documented: The Transients definition or
> purpose. Transients is not one of the many "if not here than possibly
> there" cache stores. Transients is dedicated to cache entries in
> transient state. If a given entry can be correctly loaded from a regular
> cache store, then, by definition, that entry is not transient [any more]
> and would not be in Transients. Consequently, if the entry is in
> Transients, then it is impossible to load it correctly from a regular store.
> 
> It is possible that a lock contention or a similar SMP race condition
> inside Transients would result in a cache miss instead of reading from a
> Transients-controlled entry, but, bugs notwithstanding, that should not
> happen often.

Aha, thanks.

Amos

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] Broken trunk after r14735, r14726

2016-07-18 Thread Amos Jeffries
On 17/07/2016 4:35 a.m., Alex Rousskov wrote:
> On 07/16/2016 06:56 AM, Amos Jeffries wrote:
>> On 16/07/2016 7:02 a.m., Alex Rousskov wrote:
>>> * After r14726 (GnuTLS: support for TLS session resume): Squid segfaults
>>> when attempting to connect to a Secure ICAP service. Official Squid
>>> v4.0.12 suffers from this bug.
> 
>>> * segfault when attempting to connect to a Secure ICAP REQMOD service
>>> (tested with r14726, r14734):
> 
>> Does this patch fix the session issue ?
> 
> No, it does not fix the bug in my test. Same backtrace, same
> "impossible" zero ssl->references value.
> 
> Please note that it is easy to test this using stunnel. You do not even
> need an ICAP server AFAICT -- any server (e.g., nc) would work because
> the bug is triggered before the ICAP request is sent.
> 
> 
> icap_enable on
> icap_service service_req reqmod_precache bypass=0
> icaps://127.0.0.1:1345/request
> adaptation_access service_req allow all
> 
> $ sudo stunnel -f -d 127.0.0.1:1345 -r 127.0.0.1:1344 -p ssl/CA-priv+pub.pem
> 
> 
> 
>> I'm a little worried about the code calling SetSessionResumeData.
>> OpenSSL documentation states:
>>   "If there is already a session set inside ssl (because it was set with
>> SSL_set_session() before or because the same ssl was already used for a
>> connection), SSL_SESSION_free() will be called for that session."
>>
>> But our SetSessionResumeData() is called after setting up the sessions
>> host data, etc. So I'm thinking all that setup in
>> Ssl::BlindPeerConnector::initializeTls() may be thrown away by the
>> resume action being called.
> 
> I cannot validate your concerns without doing a lot of research but I
> hope that Christos will weigh in.
> 
> Alex.
> 


Thanks. I'm reverting the session change from trunk until I can
replicate and do better testing of it.

Amos

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev