[jira] [Updated] (TS-1036) Improve squid log compatibility

2011-12-12 Thread Leif Hedstrom (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leif Hedstrom updated TS-1036:
--

Fix Version/s: 3.1.2

> Improve squid log compatibility 
> 
>
> Key: TS-1036
> URL: https://issues.apache.org/jira/browse/TS-1036
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: Logging
>Reporter: Leif Hedstrom
> Fix For: 3.1.2
>
>
> See 
> https://github.com/mnot/squidpeek/commit/3874cb902f257974d16c8eae5fc5a77c6fafbf69
>   for some "differences", from mnot as well:
> all of the ERR_* ones
> squid does TCP_REFRESH_FAIL_HIT, you do TCP_REF_FAIL_HIT
> squid does TCP_CLIENT_REFRESH_MISS, you do TCP_CLIENT_REFRESH
> squid does TCP_SWAPFAIL_MISS, you do TCP_SWAPFAIL
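
The three renamed codes listed above can be written as a small lookup table. This is only a sketch: squid_result_name() and result_code_map are hypothetical names, not Traffic Server code, and the ERR_* codes are left out because the report does not enumerate them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Result-code names from the list above: the name Traffic Server logs
   (left) and the name squid logs (right). */
static const struct {
  const char *ats_name;
  const char *squid_name;
} result_code_map[] = {
  {"TCP_REF_FAIL_HIT", "TCP_REFRESH_FAIL_HIT"},
  {"TCP_CLIENT_REFRESH", "TCP_CLIENT_REFRESH_MISS"},
  {"TCP_SWAPFAIL", "TCP_SWAPFAIL_MISS"},
};

/* Hypothetical helper: returns the squid spelling of a result code, or the
   input unchanged when the table has no entry for it. */
static const char *squid_result_name(const char *ats_name)
{
  size_t n = sizeof(result_code_map) / sizeof(result_code_map[0]);

  assert(ats_name != NULL);
  for (size_t i = 0; i < n; i++) {
    if (strcmp(ats_name, result_code_map[i].ats_name) == 0)
      return result_code_map[i].squid_name;
  }
  return ats_name;
}
```

Codes that already match squid's spelling fall through unchanged, so the table only has to carry the differences.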

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (TS-1045) PATCH: add new TSFetchHdrGet API

2011-12-12 Thread Leif Hedstrom (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leif Hedstrom updated TS-1045:
--

Fix Version/s: 3.1.2

> PATCH: add new TSFetchHdrGet API
> 
>
> Key: TS-1045
> URL: https://issues.apache.org/jira/browse/TS-1045
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: HTTP
>Reporter: James Peach
>Priority: Minor
> Fix For: 3.1.2
>
> Attachments: 0007-Add-new-public-API-TSFetchHdrGet.patch
>
>
> TSFetchUrl does not provide any way to get the headers from the result. This 
> patch adds a new API TSFetchHdrGet(), which is analogous to TSFetchRespGet() 
> and returns the headers.





[jira] [Updated] (TS-1048) Add TS API to enable plugins to use traffic server configuration infrastructure

2011-12-12 Thread Leif Hedstrom (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leif Hedstrom updated TS-1048:
--

Fix Version/s: 3.1.2

> Add TS API to enable plugins to use traffic server configuration 
> infrastructure 
> 
>
> Key: TS-1048
> URL: https://issues.apache.org/jira/browse/TS-1048
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: Configuration, TS API
> Environment: Centos 6
>Reporter: bianca cooper
>Priority: Minor
>  Labels: api-addition, configuration
> Fix For: 3.1.2
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Export RecRegisterConfigInt and RecRegisterConfigString so that plugins can 
> add configuration records to the records hashtable. 
> When a plugin needs new configuration records, it adds them by calling the 
> API from the plugin code. There is no need to add the record to the 
> RecordsConfig static array, and no need to recompile ATS each time. 





[jira] [Updated] (TS-1035) EventProcessor::spawn_thread doesn't check that there is enough event threads and segfaults

2011-12-12 Thread Leif Hedstrom (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leif Hedstrom updated TS-1035:
--

Fix Version/s: 3.1.2

> EventProcessor::spawn_thread doesn't check that there is enough event threads 
> and segfaults
> ---
>
> Key: TS-1035
> URL: https://issues.apache.org/jira/browse/TS-1035
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 3.0.1
>Reporter: Brian Geffon
> Fix For: 3.1.2
>
> Attachments: LargeNumberOfPorts.patch, UnixEventProcessor.patch
>
>
> The easiest way to see this bug is to use several hundred ports with accept 
> threads turned on. I_EventProcessor.h has a hard-coded limit of 512 event 
> threads, but spawn_thread never checks that the limit has not been exceeded, 
> so creating too many threads results in a segfault. From what I can tell, the 
> best solution is an assertion that you haven't exceeded MAX_EVENT_THREADS.
> Patch is included.
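
A minimal sketch of the missing bounds check follows. The struct and register_thread() are hypothetical stand-ins, not the real EventProcessor, and only MAX_EVENT_THREADS and the value 512 come from the report. The report proposes an assertion. This sketch returns an error instead so that the caller decides how to fail.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENT_THREADS 512 /* hard-coded limit named in the report */

/* Hypothetical stand-in for the processor's fixed-size thread table. */
struct event_processor {
  void *all_threads[MAX_EVENT_THREADS];
  int n_threads;
};

/* Stores `thread` in the next free slot. Returns the slot index, or -1
   when the table is full, instead of writing past the end of the array. */
static int register_thread(struct event_processor *ep, void *thread)
{
  assert(ep != NULL);
  if (ep->n_threads >= MAX_EVENT_THREADS)
    return -1;
  ep->all_threads[ep->n_threads] = thread;
  return ep->n_threads++;
}
```

A caller that prefers the assertion suggested in the report could wrap the call as assert(register_thread(ep, t) >= 0).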





[jira] [Updated] (TS-1043) PATCH: teach TSFetchUrl to use the content-length to find the after_body event

2011-12-12 Thread Leif Hedstrom (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leif Hedstrom updated TS-1043:
--

Fix Version/s: 3.1.2

> PATCH: teach TSFetchUrl to use the content-length to find the after_body event
> --
>
> Key: TS-1043
> URL: https://issues.apache.org/jira/browse/TS-1043
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: HTTP
>Reporter: James Peach
> Fix For: 3.1.2
>
> Attachments: 
> 0005-TSFetchUrl-use-content-length-to-fire-after_body-eve.patch
>
>
> TSFetchUrl() does not fire the after_body event until the TCP connection is 
> closed. The fix is to check the content-length as more bytes arrive and to 
> fire the after_body event once all the bytes have been received.
> This looks like https://issues.apache.org/jira/browse/TS-817 and possibly 
> https://issues.apache.org/jira/browse/TS-912
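
The content-length check described above might look roughly like this. struct fetch_state and fetch_body_bytes() are hypothetical names invented for the sketch and do not reflect the TSFetchUrl internals or the attached patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-fetch state; not the real TSFetchUrl internals. */
struct fetch_state {
  long long content_length; /* parsed Content-Length, or -1 if absent */
  long long body_received;  /* body bytes seen so far */
  int after_body_fired;     /* set once the event has been delivered */
};

/* Accounts for newly received body bytes. Returns 1 exactly once, at the
   point where the after_body event should fire; otherwise returns 0. */
static int fetch_body_bytes(struct fetch_state *fs, long long nbytes)
{
  assert(fs != NULL && nbytes >= 0);
  fs->body_received += nbytes;
  if (!fs->after_body_fired && fs->content_length >= 0 &&
      fs->body_received >= fs->content_length) {
    fs->after_body_fired = 1;
    return 1;
  }
  return 0;
}
```

When no Content-Length is known (-1 here), the sketch never fires early, so the event would still have to be delivered when the connection closes.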





[jira] [Updated] (TS-1047) Several spelling fixes in strings

2011-12-12 Thread Leif Hedstrom (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leif Hedstrom updated TS-1047:
--

Fix Version/s: 3.1.2

> Several spelling fixes in strings
> -
>
> Key: TS-1047
> URL: https://issues.apache.org/jira/browse/TS-1047
> Project: Traffic Server
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Arno Toell
>Priority: Trivial
> Fix For: 3.1.2
>
> Attachments: spelling-fixes.diff
>
>
> The current trunk contains several spelling errors in strings that are 
> printed out to users in some cases. I'm attaching a patch that fixes some of 
> the most obvious ones I noticed (or rather, the ones that Lintian, the Debian 
> static analysis tool, blamed me for :)), which you may want to merge.





[jira] [Updated] (TS-1044) PATCH: Fix TSVConn{Read,Write}VIOGet in UnixNetVConnection.

2011-12-12 Thread Leif Hedstrom (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leif Hedstrom updated TS-1044:
--

Fix Version/s: 3.1.2

> PATCH: Fix TSVConn{Read,Write}VIOGet in UnixNetVConnection.
> ---
>
> Key: TS-1044
> URL: https://issues.apache.org/jira/browse/TS-1044
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core
>Reporter: James Peach
> Fix For: 3.1.2
>
> Attachments: 
> 0006-Fix-TSVConn-Read-Write-VIOGet-in-UnixNetVConnection.patch
>
>
> UnixNetVConnection does not actually implement the virtual interface 
> necessary to support the TSVConn{Read,Write}VIOGet() APIs. Even worse, the 
> API layer assumes that this can't fail and proceeds to return a pointer to 
> stack junk.
> This patch implements TSVConn{Read,Write}VIOGet() for UnixNetVConnection and 
> allows the API to return NULL.





[jira] [Updated] (TS-1040) PATCH: teach TSHostLookup to use const

2011-12-12 Thread Leif Hedstrom (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leif Hedstrom updated TS-1040:
--

Fix Version/s: 3.1.2

> PATCH: teach TSHostLookup to use const
> --
>
> Key: TS-1040
> URL: https://issues.apache.org/jira/browse/TS-1040
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: DNS
>Reporter: James Peach
>Priority: Minor
> Fix For: 3.1.2
>
> Attachments: 
> 0002-TSHostLookup-should-take-const-hostname-argument.patch
>
>
> This patch improves the TSHostLookup() API by specifying its hostname 
> argument as const. This reduces the number of casts required of plugin 
> authors.
> The new prototype is:
> tsapi TSAction TSHostLookup(TSCont contp, const char* hostname, size_t namelen)





[jira] [Assigned] (TS-1047) Several spelling fixes in strings

2011-12-12 Thread Leif Hedstrom (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leif Hedstrom reassigned TS-1047:
-

Assignee: Leif Hedstrom

> Several spelling fixes in strings
> -
>
> Key: TS-1047
> URL: https://issues.apache.org/jira/browse/TS-1047
> Project: Traffic Server
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Arno Toell
>Assignee: Leif Hedstrom
>Priority: Trivial
> Fix For: 3.1.2
>
> Attachments: spelling-fixes.diff
>
>
> The current trunk contains several spelling errors in strings that are 
> printed out to users in some cases. I'm attaching a patch that fixes some of 
> the most obvious ones I noticed (or rather, the ones that Lintian, the Debian 
> static analysis tool, blamed me for :)), which you may want to merge.





[jira] [Assigned] (TS-1048) Add TS API to enable plugins to use traffic server configuration infrastructure

2011-12-12 Thread Leif Hedstrom (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leif Hedstrom reassigned TS-1048:
-

Assignee: Leif Hedstrom

> Add TS API to enable plugins to use traffic server configuration 
> infrastructure 
> 
>
> Key: TS-1048
> URL: https://issues.apache.org/jira/browse/TS-1048
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: Configuration, TS API
> Environment: Centos 6
>Reporter: bianca cooper
>Assignee: Leif Hedstrom
>Priority: Minor
>  Labels: api-addition, configuration
> Fix For: 3.1.2
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Export RecRegisterConfigInt and RecRegisterConfigString so that plugins can 
> add configuration records to the records hashtable. 
> When a plugin needs new configuration records, it adds them by calling the 
> API from the plugin code. There is no need to add the record to the 
> RecordsConfig static array, and no need to recompile ATS each time. 





[jira] [Updated] (TS-1041) PATCH: guarantee to populate sockaddr length for TSHostLookupResultAddrGet

2011-12-12 Thread Leif Hedstrom (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leif Hedstrom updated TS-1041:
--

Fix Version/s: 3.1.2

> PATCH: guarantee to populate sockaddr length for TSHostLookupResultAddrGet
> --
>
> Key: TS-1041
> URL: https://issues.apache.org/jira/browse/TS-1041
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: DNS
> Environment: Mac OS X 10.7
>Reporter: James Peach
>Priority: Minor
> Fix For: 3.1.2
>
> Attachments: 0003-Ensure-sockaddr-length-is-always-populated.patch
>
>
> The sockaddr returned by TSHostLookupResultAddrGet() does not always have its 
> sa_len field populated correctly. This patch guarantees that the field is set 
> to the correct value, so that plugin authors can rely on it when copying the 
> TSHostLookupResultAddrGet() result.
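
For context, sa_len is a BSD-style field (the report is from Mac OS X) and is not part of struct sockaddr on every platform. A plugin that cannot rely on it can derive the length from sa_family instead. The helper below is a hypothetical sketch of that fallback, not part of the patch or of the TS API.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: derives the address length from sa_family for
   callers that cannot rely on sa_len. Returns 0 for address families
   this sketch does not know about. */
static socklen_t sockaddr_length(const struct sockaddr *sa)
{
  assert(sa != NULL);
  switch (sa->sa_family) {
  case AF_INET:
    return (socklen_t)sizeof(struct sockaddr_in);
  case AF_INET6:
    return (socklen_t)sizeof(struct sockaddr_in6);
  default:
    return 0;
  }
}
```

The returned value is the number of bytes to copy when duplicating a lookup result into caller-owned storage.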





[jira] [Assigned] (TS-1045) PATCH: add new TSFetchHdrGet API

2011-12-12 Thread Leif Hedstrom (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leif Hedstrom reassigned TS-1045:
-

Assignee: Leif Hedstrom

> PATCH: add new TSFetchHdrGet API
> 
>
> Key: TS-1045
> URL: https://issues.apache.org/jira/browse/TS-1045
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: HTTP
>Reporter: James Peach
>Assignee: Leif Hedstrom
>Priority: Minor
> Fix For: 3.1.2
>
> Attachments: 0007-Add-new-public-API-TSFetchHdrGet.patch
>
>
> TSFetchUrl does not provide any way to get the headers from the result. This 
> patch adds a new API TSFetchHdrGet(), which is analogous to TSFetchRespGet() 
> and returns the headers.





[jira] [Assigned] (TS-1047) Several spelling fixes in strings

2011-12-12 Thread Igor Galić (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Galić reassigned TS-1047:
--

Assignee: Igor Galić  (was: Leif Hedstrom)

> Several spelling fixes in strings
> -
>
> Key: TS-1047
> URL: https://issues.apache.org/jira/browse/TS-1047
> Project: Traffic Server
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Arno Toell
>Assignee: Igor Galić
>Priority: Trivial
> Fix For: 3.1.2, 3.0.3
>
> Attachments: spelling-fixes.diff
>
>
> The current trunk contains several spelling errors in strings that are 
> printed out to users in some cases. I'm attaching a patch that fixes some of 
> the most obvious ones I noticed (or rather, the ones that Lintian, the Debian 
> static analysis tool, blamed me for :)), which you may want to merge.





[jira] [Resolved] (TS-1047) Several spelling fixes in strings

2011-12-12 Thread Igor Galić (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Galić resolved TS-1047.


   Resolution: Fixed
Fix Version/s: 3.0.3

> Several spelling fixes in strings
> -
>
> Key: TS-1047
> URL: https://issues.apache.org/jira/browse/TS-1047
> Project: Traffic Server
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Arno Toell
>Assignee: Igor Galić
>Priority: Trivial
> Fix For: 3.1.2, 3.0.3
>
> Attachments: spelling-fixes.diff
>
>
> The current trunk contains several spelling errors in strings that are 
> printed out to users in some cases. I'm attaching a patch that fixes some of 
> the most obvious ones I noticed (or rather, the ones that Lintian, the Debian 
> static analysis tool, blamed me for :)), which you may want to merge.





[jira] [Assigned] (TS-254) Add TSEscapifyString() and TSUnescapifyString()

2011-12-12 Thread Igor Galić (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Galić reassigned TS-254:
-

Assignee: Igor Galić  (was: Leif Hedstrom)

> Add TSEscapifyString() and TSUnescapifyString() 
> 
>
> Key: TS-254
> URL: https://issues.apache.org/jira/browse/TS-254
> Project: Traffic Server
>  Issue Type: New Feature
>  Components: TS API
>Affects Versions: 3.0.0
>Reporter: Leif Hedstrom
>Assignee: Igor Galić
>Priority: Minor
> Fix For: 3.1.4
>
>
> It would be very convenient for plugin developers to have SDK APIs that 
> allow for escaping and unescaping of strings. E.g.
> TSEscapifyString("http://www.ogre.com/ogre.png")
>  -> http%3A%2F%2Fwww.ogre.com%2Fogre.png
> TSUnescapifyString("http%3A%2F%2Fwww.ogre.com%2Fogre.png")
>  -> http://www.ogre.com/ogre.png
> The "unescapify" version is fairly straightforward, but the "escapify" 
> version might benefit from taking an (optional) table which describes which 
> characters need to be escaped (e.g. in some cases you want a / to be 
> escaped, but in others you do not).
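
One possible shape for the two functions is sketched below, including the optional table mentioned above. The names escapify() and unescapify(), their signatures, and the default of escaping everything outside the RFC 3986 unreserved set are assumptions made for this sketch and not the eventual TS API.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical sketch. escape_map is an optional 256-entry table in which a
   nonzero entry means "escape this byte". With NULL, everything except the
   RFC 3986 unreserved characters (letters, digits, '-', '.', '_', '~') is
   escaped. Returns the output length, or (size_t)-1 if dst is too small. */
static size_t escapify(const char *src, char *dst, size_t dst_size,
                       const unsigned char *escape_map)
{
  static const char hex[] = "0123456789ABCDEF";
  size_t out = 0;

  assert(src != NULL && dst != NULL);
  for (; *src != '\0'; src++) {
    unsigned char c = (unsigned char)*src;
    int esc = escape_map
                ? escape_map[c]
                : !(isalnum(c) || c == '-' || c == '.' || c == '_' || c == '~');
    if (out + (esc ? 3 : 1) >= dst_size) /* keep room for the final NUL */
      return (size_t)-1;
    if (esc) {
      dst[out++] = '%';
      dst[out++] = hex[c >> 4];
      dst[out++] = hex[c & 0x0F];
    } else {
      dst[out++] = (char)c;
    }
  }
  if (out >= dst_size)
    return (size_t)-1;
  dst[out] = '\0';
  return out;
}

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(unsigned char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Decodes %XX sequences in place and returns the new length. A '%' that is
   not followed by two hex digits is copied through unchanged. */
static size_t unescapify(char *s)
{
  char *in = s, *out = s;

  assert(s != NULL);
  while (*in != '\0') {
    int hi, lo;
    if (in[0] == '%' && (hi = hex_value((unsigned char)in[1])) >= 0 &&
        (lo = hex_value((unsigned char)in[2])) >= 0) {
      *out++ = (char)(hi * 16 + lo);
      in += 3;
    } else {
      *out++ = *in++;
    }
  }
  *out = '\0';
  return (size_t)(out - s);
}
```

Passing a table with only the ':' entry set leaves '/' untouched, which covers the case described above where a slash should not be escaped.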





[jira] [Commented] (TS-254) Add TSEscapifyString() and TSUnescapifyString()

2011-12-12 Thread Igor Galić (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167589#comment-13167589
 ] 

Igor Galić commented on TS-254:
---

Following 
http://mail-archives.apache.org/mod_mbox/trafficserver-dev/201112.mbox/%3c4ee61d20.2000...@ogre.com%3e
 I'm assigning the bug to myself and will start fiddling around.

> Add TSEscapifyString() and TSUnescapifyString() 
> 
>
> Key: TS-254
> URL: https://issues.apache.org/jira/browse/TS-254
> Project: Traffic Server
>  Issue Type: New Feature
>  Components: TS API
>Affects Versions: 3.0.0
>Reporter: Leif Hedstrom
>Assignee: Igor Galić
>Priority: Minor
> Fix For: 3.1.4
>
>
> It would be very convenient for plugin developers to have SDK APIs that 
> allow for escaping and unescaping of strings. E.g.
> TSEscapifyString("http://www.ogre.com/ogre.png")
>  -> http%3A%2F%2Fwww.ogre.com%2Fogre.png
> TSUnescapifyString("http%3A%2F%2Fwww.ogre.com%2Fogre.png")
>  -> http://www.ogre.com/ogre.png
> The "unescapify" version is fairly straightforward, but the "escapify" 
> version might benefit from taking an (optional) table which describes which 
> characters need to be escaped (e.g. in some cases you want a / to be 
> escaped, but in others you do not).





[jira] [Commented] (TS-949) key->volume hash table is not consistent when a disk is marked as bad or removed due to failure

2011-12-12 Thread B Wyatt (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167597#comment-13167597
 ] 

B Wyatt commented on TS-949:


Thanks John.  I think the new patch should be more stable.  I apologize for the 
misread of the previous patch, all of my volumes are matched in size so I had 
erroneously tuned out the inclusion of vol->len in the initial value of 
forvol[i].

While I am not an enforcer of code quality, I think the particulars of this 
method should at the very least be documented in the patched code.  I'll let 
someone else decide whether it is worth the effort to "pretty" it up.

All of this digging has brought up a new related issue (that I am pretty sure 
we cannot address at this level): Object loss when adding volumes.  The hash is 
now consistent, however when a new volume supersedes an existing volume in the 
hash, any object that maps to that bucket but currently stored on the old 
volume will become inaccessible.  I will probably create a new issue for that 
as this one is solved in my book.  

> key->volume hash table is not consistent when a disk is marked as bad or 
> removed due to failure
> ---
>
> Key: TS-949
> URL: https://issues.apache.org/jira/browse/TS-949
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Cache
>Affects Versions: 3.1.0
> Environment: Multi-volume cache with apparently faulty drives
>Reporter: B Wyatt
>Assignee: John Plevyak
> Fix For: 3.1.2
>
> Attachments: TS-949-jp-1.patch, TS-949-jp2.patch, TS949-BW-p1.patch
>
>
> The method for resolving collisions when distributing hash-table space to 
> volumes for the object_key->volume hash table creates inconsistency when a 
> disk is determined to be bad, or when a failed disk is removed from the 
> volume.config.
> Background:
> The hash space is distributed by round robin draft where each volume "drafts" 
> a random index in the hash table until the hash space is exhausted.  The 
> random order in which a given volume drafts hash table slots is consistent 
> across reboot/crash/disk-failure, however when a volume attempts to draft a 
> slot which has already been occupied, it skips to its next random pick and 
> attempts to draft that slot until it finds an open slot.  This ensures that 
> the hash is partitioned evenly between volumes.
> The issue:
> Resolving slot contention breaks the consistency as it is dependent on the 
> order that the volumes draft.  When rebuilding the hash after disk failure or 
> reboot with fewer drives, a volume may secure an index that was previously 
> occupied by the dead-disk.  In the old hash, the surviving volume would have 
> selected another random index due to contention.  If this index is taken, by 
> the next draft round it will represent an inconsistent key->volume result.  
> The effects of one inconsistency will then cascade as whichever volume 
> occupies that index after removing a dead disk is now behind on its draft 
> sequence as well. 
> An Example:
> ||Disk||Draft Sequence||
> |A|1,4,7,5|
> |B|4,2,8,1|
> |C|3,7,5,2|
> Pre-failure Hash Table after 2 rounds of draft:
> |A|B|C|B|C|?|A|?|
> Post-failure of drive B Hash Table after 3 rounds of draft:
> |A|C|C|A|{color:red}A{color}|?|{color:red}C{color}|?|
> Two slots have become inconsistent and more will probably follow.  These 
> inconsistencies become objects stored in a volume but lost to the top level 
> cache for open/lookup.
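
The example above can be reproduced with a short simulation of the round-robin draft. This is a sketch of the scheme as described in the report, not the cache code itself. build_table() and the array names are hypothetical, and the draft sequences are the ones from the example table.

```c
#include <assert.h>
#include <string.h>

#define SLOTS 8
#define SEQ_LEN 4
#define DISKS 3

/* Draft sequences from the example table (1-based slot numbers). */
static const int draft_seq[DISKS][SEQ_LEN] = {
  {1, 4, 7, 5}, /* disk A */
  {4, 2, 8, 1}, /* disk B */
  {3, 7, 5, 2}, /* disk C */
};

/* Fills table[0..SLOTS-1] with 'A'..'C', or '?' for unclaimed slots, after
   `rounds` rounds of drafting. alive[v] == 0 means disk v takes no part. */
static void build_table(const int alive[DISKS], int rounds, char table[SLOTS + 1])
{
  int next[DISKS] = {0}; /* position of each disk in its own sequence */

  assert(alive != NULL && table != NULL);
  memset(table, '?', SLOTS);
  table[SLOTS] = '\0';
  for (int r = 0; r < rounds; r++) {
    for (int v = 0; v < DISKS; v++) {
      if (!alive[v])
        continue;
      /* Skip picks that some disk has already claimed. */
      while (next[v] < SEQ_LEN && table[draft_seq[v][next[v]] - 1] != '?')
        next[v]++;
      if (next[v] < SEQ_LEN)
        table[draft_seq[v][next[v]++] - 1] = (char)('A' + v);
    }
  }
}
```

With all three disks alive, two rounds give ABCBC?A?. With disk B removed, three rounds give ACCAA?C?. Slots 5 and 7 change owner between surviving disks, which is the inconsistency marked in red in the tables above.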





[jira] [Issue Comment Edited] (TS-949) key->volume hash table is not consistent when a disk is marked as bad or removed due to failure

2011-12-12 Thread B Wyatt (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167597#comment-13167597
 ] 

B Wyatt edited comment on TS-949 at 12/12/11 5:05 PM:
--

Thanks John.  I think the new patch should be more stable.  I apologize for the 
misread of the previous patch, all of my volumes are matched in size so I had 
erroneously tuned out the inclusion of vol->len in the initial value of 
forvol[i].

All of this digging has brought up a new related issue (that I am pretty sure 
we cannot address at this level): Object loss when adding volumes.  The hash is 
now consistent, however when a new volume supersedes an existing volume in the 
hash, any object that maps to that bucket but currently stored on the old 
volume will become inaccessible.  I will probably create a new issue for that 
as this one is solved in my book.  

[Edit: removed statement about comments... they are there... my coffee is not, 
apparently]

  was (Author: wanderingbort):
Thanks John.  I think the new patch should be more stable.  I apologize for 
the misread of the previous patch, all of my volumes are matched in size so I 
had erroneously tuned out the inclusion of vol->len in the initial value of 
forvol[i].

While I am not an enforcer of code quality, I think the particulars of this 
method should at the very least be documented in the patched code.  I'll let 
someone else decide whether it is worth the effort to "pretty" it up.

All of this digging has brought up a new related issue (that I am pretty sure 
we cannot address at this level): Object loss when adding volumes.  The hash is 
now consistent, however when a new volume supersedes an existing volume in the 
hash, any object that maps to that bucket but currently stored on the old 
volume will become inaccessible.  I will probably create a new issue for that 
as this one is solved in my book.  
  
> key->volume hash table is not consistent when a disk is marked as bad or 
> removed due to failure
> ---
>
> Key: TS-949
> URL: https://issues.apache.org/jira/browse/TS-949
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Cache
>Affects Versions: 3.1.0
> Environment: Multi-volume cache with apparently faulty drives
>Reporter: B Wyatt
>Assignee: John Plevyak
> Fix For: 3.1.2
>
> Attachments: TS-949-jp-1.patch, TS-949-jp2.patch, TS949-BW-p1.patch
>
>
> The method for resolving collisions when distributing hash-table space to 
> volumes for the object_key->volume hash table creates inconsistency when a 
> disk is determined to be bad, or when a failed disk is removed from the 
> volume.config.
> Background:
> The hash space is distributed by round robin draft where each volume "drafts" 
> a random index in the hash table until the hash space is exhausted.  The 
> random order in which a given volume drafts hash table slots is consistent 
> across reboot/crash/disk-failure, however when a volume attempts to draft a 
> slot which has already been occupied, it skips to its next random pick and 
> attempts to draft that slot until it finds an open slot.  This ensures that 
> the hash is partitioned evenly between volumes.
> The issue:
> Resolving slot contention breaks the consistency as it is dependent on the 
> order that the volumes draft.  When rebuilding the hash after disk failure or 
> reboot with fewer drives, a volume may secure an index that was previously 
> occupied by the dead-disk.  In the old hash, the surviving volume would have 
> selected another random index due to contention.  If this index is taken, by 
> the next draft round it will represent an inconsistent key->volume result.  
> The effects of one inconsistency will then cascade as whichever volume 
> occupies that index after removing a dead disk is now behind on its draft 
> sequence as well. 
> An Example:
> ||Disk||Draft Sequence||
> |A|1,4,7,5|
> |B|4,2,8,1|
> |C|3,7,5,2|
> Pre-failure Hash Table after 2 rounds of draft:
> |A|B|C|B|C|?|A|?|
> Post-failure of drive B Hash Table after 3 rounds of draft:
> |A|C|C|A|{color:red}A{color}|?|{color:red}C{color}|?|
> Two slots have become inconsistent and more will probably follow.  These 
> inconsistencies become objects stored in a volume but lost to the top level 
> cache for open/lookup.





[jira] [Updated] (TS-949) key->volume hash table is not consistent when a disk is marked as bad or removed due to failure

2011-12-12 Thread B Wyatt (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

B Wyatt updated TS-949:
---

Attachment: explicit-pair.patch

I made a quick patch which converts the implied pairing of elements in the 
rtable array into an explicit pair (it applies on top of TS-949-jp2.patch).  

It is a non-functional change; however, I thought it might make future 
review/modification a little easier.

Feel free to toss it in the circular file, it won't hurt my feelings. 

> key->volume hash table is not consistent when a disk is marked as bad or 
> removed due to failure
> ---
>
> Key: TS-949
> URL: https://issues.apache.org/jira/browse/TS-949
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Cache
>Affects Versions: 3.1.0
> Environment: Multi-volume cache with apparently faulty drives
>Reporter: B Wyatt
>Assignee: John Plevyak
> Fix For: 3.1.2
>
> Attachments: TS-949-jp-1.patch, TS-949-jp2.patch, TS949-BW-p1.patch, 
> explicit-pair.patch
>
>
> The method for resolving collisions when distributing hash-table space to 
> volumes for the object_key->volume hash table creates inconsistency when a 
> disk is determined to be bad, or when a failed disk is removed from the 
> volume.config.
> Background:
> The hash space is distributed by round robin draft where each volume "drafts" 
> a random index in the hash table until the hash space is exhausted.  The 
> random order in which a given volume drafts hash table slots is consistent 
> across reboot/crash/disk-failure, however when a volume attempts to draft a 
> slot which has already been occupied, it skips to its next random pick and 
> attempts to draft that slot until it finds an open slot.  This ensures that 
> the hash is partitioned evenly between volumes.
> The issue:
> Resolving slot contention breaks the consistency as it is dependent on the 
> order that the volumes draft.  When rebuilding the hash after disk failure or 
> reboot with fewer drives, a volume may secure an index that was previously 
> occupied by the dead-disk.  In the old hash, the surviving volume would have 
> selected another random index due to contention.  If this index is taken, by 
> the next draft round it will represent an inconsistent key->volume result.  
> The effects of one inconsistency will then cascade as whichever volume 
> occupies that index after removing a dead disk is now behind on its draft 
> sequence as well. 
> An Example:
> ||Disk||Draft Sequence||
> |A|1,4,7,5|
> |B|4,2,8,1|
> |C|3,7,5,2|
> Pre-failure Hash Table after 2 rounds of draft:
> |A|B|C|B|C|?|A|?|
> Post-failure of drive B Hash Table after 3 rounds of draft:
> |A|C|C|A|{color:red}A{color}|?|{color:red}C{color}|?|
> Two slots have become inconsistent and more will probably follow.  Objects 
> whose keys map to these slots are still stored in a volume, but they are lost 
> to the top-level cache for open/lookup.
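
The example above can be replayed with a small stand-alone model.  This is only 
a sketch under stated assumptions: the Volume struct and the draft() and 
inconsistent_slots() functions are hypothetical names made up for illustration 
and are not the Traffic Server cache code.  Slots are numbered from 1 as in the 
tables, and '?' marks a slot that has not been drafted yet.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of one volume: an id plus its fixed draft sequence
// (1-based slot numbers, as in the example tables).
struct Volume {
  char id;
  std::vector<int> picks;
};

// Run `rounds` rounds of the round-robin draft over `slots` slots.
// In each round every volume, in order, takes its next pick that is still free.
std::string draft(const std::vector<Volume> &volumes, int slots, int rounds)
{
  assert(slots > 0 && rounds >= 0);
  std::string table(static_cast<std::size_t>(slots), '?');
  std::vector<std::size_t> next(volumes.size(), 0);
  for (int r = 0; r < rounds; ++r) {
    for (std::size_t v = 0; v < volumes.size(); ++v) {
      const std::vector<int> &seq = volumes[v].picks;
      // Contention: skip picks whose slot is already occupied.
      while (next[v] < seq.size() && table[seq[next[v]] - 1] != '?')
        ++next[v];
      if (next[v] < seq.size()) {
        table[seq[next[v]] - 1] = volumes[v].id;
        ++next[v];
      }
    }
  }
  return table;
}

// Count slots that belonged to a surviving volume before the failure
// but are assigned differently afterwards.
int inconsistent_slots(const std::string &before, const std::string &after, char dead)
{
  assert(before.size() == after.size());
  int n = 0;
  for (std::size_t i = 0; i < before.size(); ++i) {
    bool owned_by_survivor = before[i] != '?' && before[i] != dead;
    if (owned_by_survivor && after[i] != before[i])
      ++n;
  }
  return n;
}
```

Calling draft() with volumes A, B and C for 2 rounds, and then with only A and 
C for 3 rounds, reproduces the two tables above; inconsistent_slots() on the 
two results with dead volume 'B' counts the two slots marked in red.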





[jira] [Commented] (TS-949) key->volume hash table is not consistent when a disk is marked as bad or removed due to failure

2011-12-12 Thread John Plevyak (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167680#comment-13167680
 ] 

John Plevyak commented on TS-949:
-

I agree that this code is too raw.  I wanted to get the bones of a solution out 
there, but I am definitely not wedded to the implementation.

RE: when a new volume is added; one solution is to probe back into previous 
configurations (rather than, say, just the second most likely location).  This 
is the approach that the clustering code takes (see cluster/ClusterConfig.cc 
configuration_add_machine, cluster_machine_depth_list).

I think that this code and that code should be merged: the new hash table 
generator from this code combined with the history mechanism from that code.
The alternative in both cases is to just return the first N most likely 
locations.  This is probably OK for the cache because it would be a local 
in-memory probe 99.9% of the time, but it would be more expensive for 
clustering, as it would require going off-node 100% of the time.
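
To make the "probe back into previous configurations" option concrete, here is 
a minimal sketch.  It assumes that one key->volume table is kept per 
configuration, newest first; HashTable and probe_order() are hypothetical names 
used only for illustration, and this is not the code in 
cluster/ClusterConfig.cc.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical representation: one volume id per slot, '?' for an unassigned slot.
using HashTable = std::string;

// Return the volumes to probe for `slot`, starting with the newest
// configuration and walking back through older ones.  A volume that was
// already listed is not probed twice.
std::vector<char> probe_order(const std::vector<HashTable> &history, std::size_t slot)
{
  std::vector<char> order;
  for (const HashTable &table : history) {
    assert(slot < table.size());
    char vol = table[slot];
    if (vol == '?')
      continue; // slot was never assigned in this configuration
    if (std::find(order.begin(), order.end(), vol) == order.end())
      order.push_back(vol);
  }
  return order;
}
```

With the post-failure table "ACCAA?C?" followed by the pre-failure table 
"ABCBC?A?" from the issue description, the fifth slot (index 4) is probed in 
volume A first and then in volume C, which is where the object was stored 
before the rebuild.  A real implementation would also have to skip volumes that 
no longer exist.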






[jira] [Created] (TS-1049) TS hangs (dead lock) on HTTPS POST requests

2011-12-12 Thread Wilson Ho (Created) (JIRA)
TS hangs (dead lock) on HTTPS POST requests
---

 Key: TS-1049
 URL: https://issues.apache.org/jira/browse/TS-1049
 Project: Traffic Server
  Issue Type: Bug
  Components: Core, HTTP, SSL
Affects Versions: 3.1.0, 3.1.1, 3.0.2
 Environment: RedHat Enterprise Linux 6.0, Intel 32-bit
Reporter: Wilson Ho
Priority: Blocker


A very reproducible bug where the body of an HTTPS POST request is never 
forwarded to the origin server.

The client submits an HTTPS POST request to TS, which is supposed to forward it 
to the backend/origin server via HTTP.  TS processes the HTTP headers and 
establishes a connection to the origin server, but the body of the HTTPS POST 
is never read.  The request hangs until the client times out and shuts down 
the connection.

To reproduce:
1) Client connects to TS using HTTPS (works OK if it is just HTTP).
2) It must be a POST request.
3) TS must use at least 2 worker threads.
4) Easier to reproduce when the connection to the origin server is HTTP (not 
HTTPS).
5) POST body must be large enough so that the HTTP request headers and POST 
body do *NOT* fit within the same TCP packet. (2000 bytes is a good size)
6) I can consistently reproduce this problem using 2 separate clients each 
simultaneously submitting 2 requests back to back (i.e., 2 requests from each 
client, a total of 4 requests).  This gives you a high probability that at 
least one of the requests would hang.

Observation:
1) Thread A accepted and processed the HTTP headers, and called 
"UnixNetProcessor::connect_re" to prepare a new connection to the origin server.
2) Thread A must not have read the body of the POST.  Otherwise, it works fine.
3) Thread B was assigned the task to handle the origin server connection.  If 
the same thread A was picked, then everything works fine.
4) Apparently, one of the first things that thread B does is to acquire the 
mutex for reading from the client.  (Why does it do that??)
5) While thread B was holding the mutex, thread A proceeded in 
"SSLNetVConnection::net_read_io", and tried and failed to acquire the mutex.  
Thread A typically retried "SSLNetVConnection::net_read_io" soon afterwards, 
but gave up after the second failure.  If thread B released the mutex soon 
enough, thread A could proceed and everything worked.
6) From this point on, the body of the POST is never read from the client, so 
there is nothing to proxy to the origin server, and neither the consumer nor 
the producer task is scheduled to run again until the client times out.  I 
tried setting the client-side timeout to as long as 3-5 minutes, and TS does 
not recover by itself until the client closes the connection.

This is the first time I have used this bug system.  Please let me know how I 
can provide the configuration files, trace logs, etc.  Thanks!






[jira] [Updated] (TS-1049) TS hangs (dead lock) on HTTPS POST requests

2011-12-12 Thread Wilson Ho (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilson Ho updated TS-1049:
--

Attachment: records.config






[jira] [Commented] (TS-1049) TS hangs (dead lock) on HTTPS POST requests

2011-12-12 Thread Wilson Ho (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168096#comment-13168096
 ] 

Wilson Ho commented on TS-1049:
---

Adding a call to "readReschedule()" seems to make the problem go away.  But I 
have no idea whether this is the right thing to do, or whether there is a 
better way.  Please advise!

In file SSLNetVConnection.cc:

void
SSLNetVConnection::net_read_io(NetHandler * nh, EThread * lthread)
{
  int ret;
  int64_t r = 0;
  int64_t bytes = 0;
  NetState *s = &this->read;
  MIOBufferAccessor & buf = s->vio.buffer;

  MUTEX_TRY_LOCK_FOR(lock, s->vio.mutex, lthread, s->vio._cont);
  if (!lock) {
    readReschedule(nh);  // added: reschedule the read when the try-lock fails
    return;
  }
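
Why rescheduling on a failed try-lock might help can be shown with a toy model.  
This is only a sketch under simplifying assumptions and not Traffic Server 
code: ToyLock and body_is_read() are hypothetical names, the lock is held by 
"thread B" for a fixed number of attempts, and the read handler initially gets 
two chances (the first call and one retry), as observed in the issue.

```cpp
#include <cassert>

// Toy stand-in for the VIO mutex: "thread B" holds it for the first
// `held_for` attempts that are made against it.
struct ToyLock {
  int held_for;
  bool try_lock()
  {
    if (held_for > 0) {
      --held_for;
      return false;
    }
    return true;
  }
};

// Returns true if the POST body ends up being read.
// `reschedule` selects whether a failed try-lock queues another read attempt.
bool body_is_read(int held_for, bool reschedule, int initial_attempts = 2)
{
  assert(held_for >= 0 && initial_attempts >= 0);
  ToyLock lock{held_for};
  int pending = initial_attempts; // read attempts waiting to run
  while (pending > 0) {
    --pending;
    if (lock.try_lock())
      return true; // lock acquired, the body is read
    if (reschedule)
      ++pending; // plays the role of the added readReschedule(nh)
    // otherwise this attempt is simply dropped
  }
  return false; // nothing left to run: the read never happens
}
```

In this model, body_is_read(1, false) succeeds because the lock is released 
before the second attempt, body_is_read(2, false) never reads the body, and 
body_is_read(5, true) succeeds because every failed attempt queues another one.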


