Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]
I dont think composite tvbs actually work. or at least they didnt work when we originally wrote the reassembly code. They have been fixed last year. They are working for me in 1.8.x code. -- ---> Dirk Jagdmann > http://cubic.org/~doj -> http://llg.cubic.org ___ Sent via:Wireshark-dev mailing list Archives:http://www.wireshark.org/lists/wireshark-dev Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe
Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]
On Apr 18, 2013, at 1:40 PM, Evan Huus wrote: > The reassembly code could then add meta-data to this when > reassembling, and the tvb could lazily refetch the underlying tvbs > using the existing wiretap interface? Lazily *regenerate* the underlying tvbs. The "bottom-level" tvbs would be regenerated by reading packet data from the file. However, there might be "intermediate-level" tvbs that do *not* contain data directly fetched from a file; instead, the data might have been decompressed, or decrypted, or de-encoded from base-64, or otherwise processed. Consider a packet reassembled from 3 lower-leel chunks, the first of which came from a base-64 decoding of still lower-level data reassembled from 5 even lower-level decrypted packets ___ Sent via:Wireshark-dev mailing list Archives:http://www.wireshark.org/lists/wireshark-dev Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe
Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]
On 04/18/13 17:13, Evan Huus wrote: On Thu, Apr 18, 2013 at 4:53 PM, Jeff Morriss wrote: Hmmm, the way you mentioned wiretap makes me think/realize that the TVBs should not actually store the real offset (since we'd want that stuff abstracted away by wiretap so the tvbuff code wouldn't have to deal with ugliness like compressed files and so forth). That might make it a bit harder... Unless we backed the TVBs with temporary files (seems wasteful and a pain to clean up after). Not sure I follow this, as I'm not sure what you mean by 'real offset'. TVBs don't currently store any offset into the file itself. And the virtual ones I described wouldn't have to, that information is already in frame_data->file_off. My original (not fully thought through) thought was probably something like: file-backed TVBs would basically have a new type (TVBUFF_FILEBACKED or something) and contain: - a pointer which can hold (malloc'd) memory for the contents of the TVB (for caching) - a file pointer (probably a bad idea) - a file offset If instead the TVB points to the frame_data which has the offset (I didn't realize it did) then there's no need for the tvbuff to have it too. ___ Sent via:Wireshark-dev mailing list Archives:http://www.wireshark.org/lists/wireshark-dev Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe
Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]
On Thu, Apr 18, 2013 at 4:59 PM, Anders Broman wrote: > Evan Huus skrev 2013-04-18 22:40: > >> On Thu, Apr 18, 2013 at 3:56 PM, Jeff Morriss >> wrote: >>> >>> On 04/18/13 15:14, Evan Huus wrote: This is a tangential issue that has always confused me. Why do we malloc+memcpy data for reassembly when we already have 'virtual' composite TVBs? Wouldn't it be more efficient (in time and memory) to create a composite TVB for each reassembly and then build the reassembled packet in it? You would never have to copy or allocate any actual packet data... >>> >>> >>> There are a couple of problems with doing that (that I recall): >>> >>> 1) Composite TVBs don't actually work (or didn't work until very >>> recently?). >>> >>> 2) The data behind a TVB goes away as soon as we're done dissecting (and >>> displaying) the packet. That is, the TVB data is overwritten (IIRC) when >>> the next packet is read. >>> >>> I suppose there was never any real reason to try to make reassembly work >>> with composite TVBs: if they're just more malloc()'d memory then why mess >>> with it rather than allocate our own copy of the data? (Well, OK, it >>> would >>> save a data copy, but...) >> >> OK, so then the optimal case would be a tvb implementation that stored >> only frame_data pointers, offsets and lengths... similar but not >> identical to the current composite implementation. >> >> The reassembly code could then add meta-data to this when >> reassembling, and the tvb could lazily refetch the underlying tvbs >> using the existing wiretap interface? If we add some sort of caching >> mechanism so that repeated accesses didn't keep forcing reads of the >> original file then I expect this would be very fast: >> >> - adding fragments to reassembly would be near-instantaneous (just a >> few pointer updates) >> - reassembled tvbs would take minimal memory except when accessed >> (using tvb_get_* or proto_tree_add_*) >> - accessing a reassembled tvb would just be an offset calculation and >> then a wtap read to bring into memory the underlying real packet(s) >> containing the data being requested (assuming they aren't already >> cached) >> >> Thoughts? > > If on top of that small enough files could be mmaped it'd be even faster. Yes although I think this could be done entirely separately in wiretap without touching the reassembly or tvbuff code? It would need a wiretap API change since right now we pass in a buffer to fill and the new code would need to return a buffer pointer instead, but other than that I think it would be fairly unintrusive. ___ Sent via:Wireshark-dev mailing list Archives:http://www.wireshark.org/lists/wireshark-dev Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe
Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]
On Thu, Apr 18, 2013 at 4:53 PM, Jeff Morriss wrote: > On 04/18/13 16:40, Evan Huus wrote: >> >> On Thu, Apr 18, 2013 at 3:56 PM, Jeff Morriss >> wrote: >>> >>> On 04/18/13 15:14, Evan Huus wrote: This is a tangential issue that has always confused me. Why do we malloc+memcpy data for reassembly when we already have 'virtual' composite TVBs? Wouldn't it be more efficient (in time and memory) to create a composite TVB for each reassembly and then build the reassembled packet in it? You would never have to copy or allocate any actual packet data... >>> >>> >>> >>> There are a couple of problems with doing that (that I recall): >>> >>> 1) Composite TVBs don't actually work (or didn't work until very >>> recently?). >>> >>> 2) The data behind a TVB goes away as soon as we're done dissecting (and >>> displaying) the packet. That is, the TVB data is overwritten (IIRC) when >>> the next packet is read. >>> >>> I suppose there was never any real reason to try to make reassembly work >>> with composite TVBs: if they're just more malloc()'d memory then why mess >>> with it rather than allocate our own copy of the data? (Well, OK, it >>> would >>> save a data copy, but...) >> >> >> OK, so then the optimal case would be a tvb implementation that stored >> only frame_data pointers, offsets and lengths... similar but not >> identical to the current composite implementation. >> >> The reassembly code could then add meta-data to this when >> reassembling, and the tvb could lazily refetch the underlying tvbs >> using the existing wiretap interface? If we add some sort of caching >> mechanism so that repeated accesses didn't keep forcing reads of the >> original file then I expect this would be very fast: >> >> - adding fragments to reassembly would be near-instantaneous (just a >> few pointer updates) >> - reassembled tvbs would take minimal memory except when accessed >> (using tvb_get_* or proto_tree_add_*) >> - accessing a reassembled tvb would just be an offset calculation and >> then a wtap read to bring into memory the underlying real packet(s) >> containing the data being requested (assuming they aren't already >> cached) >> >> Thoughts? > > > Yep, that sounds about what I was thinking of for file-backed (+composite) > TVBs. :-) Oh, OK, I misunderstood that then. > Basically reassembly would only have to deal with composite TVBs (for the > reassembled data). The TVB layer, using file backing for the real data, > would deal with keeping (and not keeping) the data in memory when it is > needed (caching). Yes. > Hmmm, the way you mentioned wiretap makes me think/realize that the TVBs > should not actually store the real offset (since we'd want that stuff > abstracted away by wiretap so the tvbuff code wouldn't have to deal with > ugliness like compressed files and so forth). That might make it a bit > harder... Unless we backed the TVBs with temporary files (seems wasteful > and a pain to clean up after). Not sure I follow this, as I'm not sure what you mean by 'real offset'. TVBs don't currently store any offset into the file itself. And the virtual ones I described wouldn't have to, that information is already in frame_data->file_off. ___ Sent via:Wireshark-dev mailing list Archives:http://www.wireshark.org/lists/wireshark-dev Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe
Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]
Evan Huus skrev 2013-04-18 22:40: On Thu, Apr 18, 2013 at 3:56 PM, Jeff Morriss wrote: On 04/18/13 15:14, Evan Huus wrote: This is a tangential issue that has always confused me. Why do we malloc+memcpy data for reassembly when we already have 'virtual' composite TVBs? Wouldn't it be more efficient (in time and memory) to create a composite TVB for each reassembly and then build the reassembled packet in it? You would never have to copy or allocate any actual packet data... There are a couple of problems with doing that (that I recall): 1) Composite TVBs don't actually work (or didn't work until very recently?). 2) The data behind a TVB goes away as soon as we're done dissecting (and displaying) the packet. That is, the TVB data is overwritten (IIRC) when the next packet is read. I suppose there was never any real reason to try to make reassembly work with composite TVBs: if they're just more malloc()'d memory then why mess with it rather than allocate our own copy of the data? (Well, OK, it would save a data copy, but...) OK, so then the optimal case would be a tvb implementation that stored only frame_data pointers, offsets and lengths... similar but not identical to the current composite implementation. The reassembly code could then add meta-data to this when reassembling, and the tvb could lazily refetch the underlying tvbs using the existing wiretap interface? If we add some sort of caching mechanism so that repeated accesses didn't keep forcing reads of the original file then I expect this would be very fast: - adding fragments to reassembly would be near-instantaneous (just a few pointer updates) - reassembled tvbs would take minimal memory except when accessed (using tvb_get_* or proto_tree_add_*) - accessing a reassembled tvb would just be an offset calculation and then a wtap read to bring into memory the underlying real packet(s) containing the data being requested (assuming they aren't already cached) Thoughts? If on top of that small enough files could be mmaped it'd be even faster. ___ Sent via:Wireshark-dev mailing list Archives:http://www.wireshark.org/lists/wireshark-dev Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe ___ Sent via:Wireshark-dev mailing list Archives:http://www.wireshark.org/lists/wireshark-dev Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe
Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]
On 04/18/13 16:40, Evan Huus wrote: On Thu, Apr 18, 2013 at 3:56 PM, Jeff Morriss wrote: On 04/18/13 15:14, Evan Huus wrote: This is a tangential issue that has always confused me. Why do we malloc+memcpy data for reassembly when we already have 'virtual' composite TVBs? Wouldn't it be more efficient (in time and memory) to create a composite TVB for each reassembly and then build the reassembled packet in it? You would never have to copy or allocate any actual packet data... There are a couple of problems with doing that (that I recall): 1) Composite TVBs don't actually work (or didn't work until very recently?). 2) The data behind a TVB goes away as soon as we're done dissecting (and displaying) the packet. That is, the TVB data is overwritten (IIRC) when the next packet is read. I suppose there was never any real reason to try to make reassembly work with composite TVBs: if they're just more malloc()'d memory then why mess with it rather than allocate our own copy of the data? (Well, OK, it would save a data copy, but...) OK, so then the optimal case would be a tvb implementation that stored only frame_data pointers, offsets and lengths... similar but not identical to the current composite implementation. The reassembly code could then add meta-data to this when reassembling, and the tvb could lazily refetch the underlying tvbs using the existing wiretap interface? If we add some sort of caching mechanism so that repeated accesses didn't keep forcing reads of the original file then I expect this would be very fast: - adding fragments to reassembly would be near-instantaneous (just a few pointer updates) - reassembled tvbs would take minimal memory except when accessed (using tvb_get_* or proto_tree_add_*) - accessing a reassembled tvb would just be an offset calculation and then a wtap read to bring into memory the underlying real packet(s) containing the data being requested (assuming they aren't already cached) Thoughts? Yep, that sounds about what I was thinking of for file-backed (+composite) TVBs. :-) Basically reassembly would only have to deal with composite TVBs (for the reassembled data). The TVB layer, using file backing for the real data, would deal with keeping (and not keeping) the data in memory when it is needed (caching). Hmmm, the way you mentioned wiretap makes me think/realize that the TVBs should not actually store the real offset (since we'd want that stuff abstracted away by wiretap so the tvbuff code wouldn't have to deal with ugliness like compressed files and so forth). That might make it a bit harder... Unless we backed the TVBs with temporary files (seems wasteful and a pain to clean up after). ___ Sent via:Wireshark-dev mailing list Archives:http://www.wireshark.org/lists/wireshark-dev Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe
Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]
On Thu, Apr 18, 2013 at 3:56 PM, Jeff Morriss wrote: > On 04/18/13 15:14, Evan Huus wrote: >> >> This is a tangential issue that has always confused me. >> >> Why do we malloc+memcpy data for reassembly when we already have >> 'virtual' composite TVBs? >> >> Wouldn't it be more efficient (in time and memory) to create a >> composite TVB for each reassembly and then build the reassembled >> packet in it? You would never have to copy or allocate any actual >> packet data... > > > There are a couple of problems with doing that (that I recall): > > 1) Composite TVBs don't actually work (or didn't work until very recently?). > > 2) The data behind a TVB goes away as soon as we're done dissecting (and > displaying) the packet. That is, the TVB data is overwritten (IIRC) when > the next packet is read. > > I suppose there was never any real reason to try to make reassembly work > with composite TVBs: if they're just more malloc()'d memory then why mess > with it rather than allocate our own copy of the data? (Well, OK, it would > save a data copy, but...) OK, so then the optimal case would be a tvb implementation that stored only frame_data pointers, offsets and lengths... similar but not identical to the current composite implementation. The reassembly code could then add meta-data to this when reassembling, and the tvb could lazily refetch the underlying tvbs using the existing wiretap interface? If we add some sort of caching mechanism so that repeated accesses didn't keep forcing reads of the original file then I expect this would be very fast: - adding fragments to reassembly would be near-instantaneous (just a few pointer updates) - reassembled tvbs would take minimal memory except when accessed (using tvb_get_* or proto_tree_add_*) - accessing a reassembled tvb would just be an offset calculation and then a wtap read to bring into memory the underlying real packet(s) containing the data being requested (assuming they aren't already cached) Thoughts? ___ Sent via:Wireshark-dev mailing list Archives:http://www.wireshark.org/lists/wireshark-dev Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe
Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]
I dont think composite tvbs actually work. or at least they didnt work when we originally wrote the reassembly code. On Thu, Apr 18, 2013 at 12:14 PM, Evan Huus wrote: > This is a tangential issue that has always confused me. > > Why do we malloc+memcpy data for reassembly when we already have > 'virtual' composite TVBs? > > Wouldn't it be more efficient (in time and memory) to create a > composite TVB for each reassembly and then build the reassembled > packet in it? You would never have to copy or allocate any actual > packet data... > > Or am I misunderstanding how composite TVBs actually work? > > Thanks, > Evan > > P.S. Clearly some protocols where the payload is XORed or hashed > wouldn't be able to do this, but most protocols doing reassembly just > carry the raw payload bytes. > ___ > Sent via:Wireshark-dev mailing list > Archives:http://www.wireshark.org/lists/wireshark-dev > Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev > mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe ___ Sent via:Wireshark-dev mailing list Archives:http://www.wireshark.org/lists/wireshark-dev Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe
Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]
On 04/18/13 15:14, Evan Huus wrote: This is a tangential issue that has always confused me. Why do we malloc+memcpy data for reassembly when we already have 'virtual' composite TVBs? Wouldn't it be more efficient (in time and memory) to create a composite TVB for each reassembly and then build the reassembled packet in it? You would never have to copy or allocate any actual packet data... There are a couple of problems with doing that (that I recall): 1) Composite TVBs don't actually work (or didn't work until very recently?). 2) The data behind a TVB goes away as soon as we're done dissecting (and displaying) the packet. That is, the TVB data is overwritten (IIRC) when the next packet is read. I suppose there was never any real reason to try to make reassembly work with composite TVBs: if they're just more malloc()'d memory then why mess with it rather than allocate our own copy of the data? (Well, OK, it would save a data copy, but...) ___ Sent via:Wireshark-dev mailing list Archives:http://www.wireshark.org/lists/wireshark-dev Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe
[Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]
This is a tangential issue that has always confused me. Why do we malloc+memcpy data for reassembly when we already have 'virtual' composite TVBs? Wouldn't it be more efficient (in time and memory) to create a composite TVB for each reassembly and then build the reassembled packet in it? You would never have to copy or allocate any actual packet data... Or am I misunderstanding how composite TVBs actually work? Thanks, Evan P.S. Clearly some protocols where the payload is XORed or hashed wouldn't be able to do this, but most protocols doing reassembly just carry the raw payload bytes. ___ Sent via:Wireshark-dev mailing list Archives:http://www.wireshark.org/lists/wireshark-dev Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe