Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]

2013-04-18 Thread Dirk Jagdmann

I dont think composite tvbs actually work.   or at least they didnt
work when we originally wrote the reassembly code.


They have been fixed last year. They are working for me in 1.8.x code.

--
---> Dirk Jagdmann
> http://cubic.org/~doj
-> http://llg.cubic.org
___
Sent via:Wireshark-dev mailing list 
Archives:http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe


Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]

2013-04-18 Thread Guy Harris

On Apr 18, 2013, at 1:40 PM, Evan Huus  wrote:

> The reassembly code could then add meta-data to this when
> reassembling, and the tvb could lazily refetch the underlying tvbs
> using the existing wiretap interface?

Lazily *regenerate* the underlying tvbs.

The "bottom-level" tvbs would be regenerated by reading packet data from the 
file.

However, there might be "intermediate-level" tvbs that do *not* contain data 
directly fetched from a file; instead, the data might have been decompressed, 
or decrypted, or de-encoded from base-64, or otherwise processed.  Consider a 
packet reassembled from 3 lower-leel chunks, the first of which came from a 
base-64 decoding of still lower-level data reassembled from 5 even lower-level 
decrypted packets

___
Sent via:Wireshark-dev mailing list 
Archives:http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
 mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe


Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]

2013-04-18 Thread Jeff Morriss

On 04/18/13 17:13, Evan Huus wrote:

On Thu, Apr 18, 2013 at 4:53 PM, Jeff Morriss  wrote:

Hmmm, the way you mentioned wiretap makes me think/realize that the TVBs
should not actually store the real offset (since we'd want that stuff
abstracted away by wiretap so the tvbuff code wouldn't have to deal with
ugliness like compressed files and so forth).  That might make it a bit
harder...  Unless we backed the TVBs with temporary files (seems wasteful
and a pain to clean up after).


Not sure I follow this, as I'm not sure what you mean by 'real
offset'. TVBs don't currently store any offset into the file itself.
And the virtual ones I described wouldn't have to, that information is
already in frame_data->file_off.


My original (not fully thought through) thought was probably something 
like: file-backed TVBs would basically have a new type 
(TVBUFF_FILEBACKED or something) and contain:
 - a pointer which can hold (malloc'd) memory for the contents of the 
TVB (for caching)

 - a file pointer (probably a bad idea)
 - a file offset

If instead the TVB points to the frame_data which has the offset (I 
didn't realize it did) then there's no need for the tvbuff to have it too.


___
Sent via:Wireshark-dev mailing list 
Archives:http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe


Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]

2013-04-18 Thread Evan Huus
On Thu, Apr 18, 2013 at 4:59 PM, Anders Broman  wrote:
> Evan Huus skrev 2013-04-18 22:40:
>
>> On Thu, Apr 18, 2013 at 3:56 PM, Jeff Morriss 
>> wrote:
>>>
>>> On 04/18/13 15:14, Evan Huus wrote:

 This is a tangential issue that has always confused me.

 Why do we malloc+memcpy data for reassembly when we already have
 'virtual' composite TVBs?

 Wouldn't it be more efficient (in time and memory) to create a
 composite TVB for each reassembly and then build the reassembled
 packet in it? You would never have to copy or allocate any actual
 packet data...
>>>
>>>
>>> There are a couple of problems with doing that (that I recall):
>>>
>>> 1) Composite TVBs don't actually work (or didn't work until very
>>> recently?).
>>>
>>> 2) The data behind a TVB goes away as soon as we're done dissecting (and
>>> displaying) the packet.  That is, the TVB data is overwritten (IIRC) when
>>> the next packet is read.
>>>
>>> I suppose there was never any real reason to try to make reassembly work
>>> with composite TVBs: if they're just more malloc()'d memory then why mess
>>> with it rather than allocate our own copy of the data?  (Well, OK, it
>>> would
>>> save a data copy, but...)
>>
>> OK, so then the optimal case would be a tvb implementation that stored
>> only frame_data pointers, offsets and lengths... similar but not
>> identical to the current composite implementation.
>>
>> The reassembly code could then add meta-data to this when
>> reassembling, and the tvb could lazily refetch the underlying tvbs
>> using the existing wiretap interface? If we add some sort of caching
>> mechanism so that repeated accesses didn't keep forcing reads of the
>> original file then I expect this would be very fast:
>>
>> - adding fragments to reassembly would be near-instantaneous (just a
>> few pointer updates)
>> - reassembled tvbs would take minimal memory except when accessed
>> (using tvb_get_* or proto_tree_add_*)
>> - accessing a reassembled tvb would just be an offset calculation and
>> then a wtap read to bring into memory the underlying real packet(s)
>> containing the data being requested (assuming they aren't already
>> cached)
>>
>> Thoughts?
>
> If on top of that small enough files could be mmaped it'd be even faster.

Yes although I think this could be done entirely separately in wiretap
without touching the reassembly or tvbuff code? It would need a
wiretap API change since right now we pass in a buffer to fill and the
new code would need to return a buffer pointer instead, but other than
that I think it would be fairly unintrusive.
___
Sent via:Wireshark-dev mailing list 
Archives:http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
 mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe


Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]

2013-04-18 Thread Evan Huus
On Thu, Apr 18, 2013 at 4:53 PM, Jeff Morriss  wrote:
> On 04/18/13 16:40, Evan Huus wrote:
>>
>> On Thu, Apr 18, 2013 at 3:56 PM, Jeff Morriss 
>> wrote:
>>>
>>> On 04/18/13 15:14, Evan Huus wrote:


 This is a tangential issue that has always confused me.

 Why do we malloc+memcpy data for reassembly when we already have
 'virtual' composite TVBs?

 Wouldn't it be more efficient (in time and memory) to create a
 composite TVB for each reassembly and then build the reassembled
 packet in it? You would never have to copy or allocate any actual
 packet data...
>>>
>>>
>>>
>>> There are a couple of problems with doing that (that I recall):
>>>
>>> 1) Composite TVBs don't actually work (or didn't work until very
>>> recently?).
>>>
>>> 2) The data behind a TVB goes away as soon as we're done dissecting (and
>>> displaying) the packet.  That is, the TVB data is overwritten (IIRC) when
>>> the next packet is read.
>>>
>>> I suppose there was never any real reason to try to make reassembly work
>>> with composite TVBs: if they're just more malloc()'d memory then why mess
>>> with it rather than allocate our own copy of the data?  (Well, OK, it
>>> would
>>> save a data copy, but...)
>>
>>
>> OK, so then the optimal case would be a tvb implementation that stored
>> only frame_data pointers, offsets and lengths... similar but not
>> identical to the current composite implementation.
>>
>> The reassembly code could then add meta-data to this when
>> reassembling, and the tvb could lazily refetch the underlying tvbs
>> using the existing wiretap interface? If we add some sort of caching
>> mechanism so that repeated accesses didn't keep forcing reads of the
>> original file then I expect this would be very fast:
>>
>> - adding fragments to reassembly would be near-instantaneous (just a
>> few pointer updates)
>> - reassembled tvbs would take minimal memory except when accessed
>> (using tvb_get_* or proto_tree_add_*)
>> - accessing a reassembled tvb would just be an offset calculation and
>> then a wtap read to bring into memory the underlying real packet(s)
>> containing the data being requested (assuming they aren't already
>> cached)
>>
>> Thoughts?
>
>
> Yep, that sounds about what I was thinking of for file-backed (+composite)
> TVBs. :-)

Oh, OK, I misunderstood that then.

> Basically reassembly would only have to deal with composite TVBs (for the
> reassembled data).  The TVB layer, using file backing for the real data,
> would deal with keeping (and not keeping) the data in memory when it is
> needed (caching).

Yes.

> Hmmm, the way you mentioned wiretap makes me think/realize that the TVBs
> should not actually store the real offset (since we'd want that stuff
> abstracted away by wiretap so the tvbuff code wouldn't have to deal with
> ugliness like compressed files and so forth).  That might make it a bit
> harder...  Unless we backed the TVBs with temporary files (seems wasteful
> and a pain to clean up after).

Not sure I follow this, as I'm not sure what you mean by 'real
offset'. TVBs don't currently store any offset into the file itself.
And the virtual ones I described wouldn't have to, that information is
already in frame_data->file_off.
___
Sent via:Wireshark-dev mailing list 
Archives:http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
 mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe


Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]

2013-04-18 Thread Anders Broman

Evan Huus skrev 2013-04-18 22:40:

On Thu, Apr 18, 2013 at 3:56 PM, Jeff Morriss  wrote:

On 04/18/13 15:14, Evan Huus wrote:

This is a tangential issue that has always confused me.

Why do we malloc+memcpy data for reassembly when we already have
'virtual' composite TVBs?

Wouldn't it be more efficient (in time and memory) to create a
composite TVB for each reassembly and then build the reassembled
packet in it? You would never have to copy or allocate any actual
packet data...


There are a couple of problems with doing that (that I recall):

1) Composite TVBs don't actually work (or didn't work until very recently?).

2) The data behind a TVB goes away as soon as we're done dissecting (and
displaying) the packet.  That is, the TVB data is overwritten (IIRC) when
the next packet is read.

I suppose there was never any real reason to try to make reassembly work
with composite TVBs: if they're just more malloc()'d memory then why mess
with it rather than allocate our own copy of the data?  (Well, OK, it would
save a data copy, but...)

OK, so then the optimal case would be a tvb implementation that stored
only frame_data pointers, offsets and lengths... similar but not
identical to the current composite implementation.

The reassembly code could then add meta-data to this when
reassembling, and the tvb could lazily refetch the underlying tvbs
using the existing wiretap interface? If we add some sort of caching
mechanism so that repeated accesses didn't keep forcing reads of the
original file then I expect this would be very fast:

- adding fragments to reassembly would be near-instantaneous (just a
few pointer updates)
- reassembled tvbs would take minimal memory except when accessed
(using tvb_get_* or proto_tree_add_*)
- accessing a reassembled tvb would just be an offset calculation and
then a wtap read to bring into memory the underlying real packet(s)
containing the data being requested (assuming they aren't already
cached)

Thoughts?

If on top of that small enough files could be mmaped it'd be even faster.

___
Sent via:Wireshark-dev mailing list 
Archives:http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
  mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe




___
Sent via:Wireshark-dev mailing list 
Archives:http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe


Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]

2013-04-18 Thread Jeff Morriss

On 04/18/13 16:40, Evan Huus wrote:

On Thu, Apr 18, 2013 at 3:56 PM, Jeff Morriss  wrote:

On 04/18/13 15:14, Evan Huus wrote:


This is a tangential issue that has always confused me.

Why do we malloc+memcpy data for reassembly when we already have
'virtual' composite TVBs?

Wouldn't it be more efficient (in time and memory) to create a
composite TVB for each reassembly and then build the reassembled
packet in it? You would never have to copy or allocate any actual
packet data...



There are a couple of problems with doing that (that I recall):

1) Composite TVBs don't actually work (or didn't work until very recently?).

2) The data behind a TVB goes away as soon as we're done dissecting (and
displaying) the packet.  That is, the TVB data is overwritten (IIRC) when
the next packet is read.

I suppose there was never any real reason to try to make reassembly work
with composite TVBs: if they're just more malloc()'d memory then why mess
with it rather than allocate our own copy of the data?  (Well, OK, it would
save a data copy, but...)


OK, so then the optimal case would be a tvb implementation that stored
only frame_data pointers, offsets and lengths... similar but not
identical to the current composite implementation.

The reassembly code could then add meta-data to this when
reassembling, and the tvb could lazily refetch the underlying tvbs
using the existing wiretap interface? If we add some sort of caching
mechanism so that repeated accesses didn't keep forcing reads of the
original file then I expect this would be very fast:

- adding fragments to reassembly would be near-instantaneous (just a
few pointer updates)
- reassembled tvbs would take minimal memory except when accessed
(using tvb_get_* or proto_tree_add_*)
- accessing a reassembled tvb would just be an offset calculation and
then a wtap read to bring into memory the underlying real packet(s)
containing the data being requested (assuming they aren't already
cached)

Thoughts?


Yep, that sounds about what I was thinking of for file-backed 
(+composite) TVBs. :-)


Basically reassembly would only have to deal with composite TVBs (for 
the reassembled data).  The TVB layer, using file backing for the real 
data, would deal with keeping (and not keeping) the data in memory when 
it is needed (caching).


Hmmm, the way you mentioned wiretap makes me think/realize that the TVBs 
should not actually store the real offset (since we'd want that stuff 
abstracted away by wiretap so the tvbuff code wouldn't have to deal with 
ugliness like compressed files and so forth).  That might make it a bit 
harder...  Unless we backed the TVBs with temporary files (seems 
wasteful and a pain to clean up after).


___
Sent via:Wireshark-dev mailing list 
Archives:http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe


Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]

2013-04-18 Thread Evan Huus
On Thu, Apr 18, 2013 at 3:56 PM, Jeff Morriss  wrote:
> On 04/18/13 15:14, Evan Huus wrote:
>>
>> This is a tangential issue that has always confused me.
>>
>> Why do we malloc+memcpy data for reassembly when we already have
>> 'virtual' composite TVBs?
>>
>> Wouldn't it be more efficient (in time and memory) to create a
>> composite TVB for each reassembly and then build the reassembled
>> packet in it? You would never have to copy or allocate any actual
>> packet data...
>
>
> There are a couple of problems with doing that (that I recall):
>
> 1) Composite TVBs don't actually work (or didn't work until very recently?).
>
> 2) The data behind a TVB goes away as soon as we're done dissecting (and
> displaying) the packet.  That is, the TVB data is overwritten (IIRC) when
> the next packet is read.
>
> I suppose there was never any real reason to try to make reassembly work
> with composite TVBs: if they're just more malloc()'d memory then why mess
> with it rather than allocate our own copy of the data?  (Well, OK, it would
> save a data copy, but...)

OK, so then the optimal case would be a tvb implementation that stored
only frame_data pointers, offsets and lengths... similar but not
identical to the current composite implementation.

The reassembly code could then add meta-data to this when
reassembling, and the tvb could lazily refetch the underlying tvbs
using the existing wiretap interface? If we add some sort of caching
mechanism so that repeated accesses didn't keep forcing reads of the
original file then I expect this would be very fast:

- adding fragments to reassembly would be near-instantaneous (just a
few pointer updates)
- reassembled tvbs would take minimal memory except when accessed
(using tvb_get_* or proto_tree_add_*)
- accessing a reassembled tvb would just be an offset calculation and
then a wtap read to bring into memory the underlying real packet(s)
containing the data being requested (assuming they aren't already
cached)

Thoughts?
___
Sent via:Wireshark-dev mailing list 
Archives:http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
 mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe


Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]

2013-04-18 Thread ronnie sahlberg
I dont think composite tvbs actually work.   or at least they didnt
work when we originally wrote the reassembly code.



On Thu, Apr 18, 2013 at 12:14 PM, Evan Huus  wrote:
> This is a tangential issue that has always confused me.
>
> Why do we malloc+memcpy data for reassembly when we already have
> 'virtual' composite TVBs?
>
> Wouldn't it be more efficient (in time and memory) to create a
> composite TVB for each reassembly and then build the reassembled
> packet in it? You would never have to copy or allocate any actual
> packet data...
>
> Or am I misunderstanding how composite TVBs actually work?
>
> Thanks,
> Evan
>
> P.S. Clearly some protocols where the payload is XORed or hashed
> wouldn't be able to do this, but most protocols doing reassembly just
> carry the raw payload bytes.
> ___
> Sent via:Wireshark-dev mailing list 
> Archives:http://www.wireshark.org/lists/wireshark-dev
> Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
>  mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe
___
Sent via:Wireshark-dev mailing list 
Archives:http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
 mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe


Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]

2013-04-18 Thread Jeff Morriss

On 04/18/13 15:14, Evan Huus wrote:

This is a tangential issue that has always confused me.

Why do we malloc+memcpy data for reassembly when we already have
'virtual' composite TVBs?

Wouldn't it be more efficient (in time and memory) to create a
composite TVB for each reassembly and then build the reassembled
packet in it? You would never have to copy or allocate any actual
packet data...


There are a couple of problems with doing that (that I recall):

1) Composite TVBs don't actually work (or didn't work until very recently?).

2) The data behind a TVB goes away as soon as we're done dissecting (and 
displaying) the packet.  That is, the TVB data is overwritten (IIRC) 
when the next packet is read.


I suppose there was never any real reason to try to make reassembly work 
with composite TVBs: if they're just more malloc()'d memory then why 
mess with it rather than allocate our own copy of the data?  (Well, OK, 
it would save a data copy, but...)


___
Sent via:Wireshark-dev mailing list 
Archives:http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe


[Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'13]

2013-04-18 Thread Evan Huus
This is a tangential issue that has always confused me.

Why do we malloc+memcpy data for reassembly when we already have
'virtual' composite TVBs?

Wouldn't it be more efficient (in time and memory) to create a
composite TVB for each reassembly and then build the reassembled
packet in it? You would never have to copy or allocate any actual
packet data...

Or am I misunderstanding how composite TVBs actually work?

Thanks,
Evan

P.S. Clearly some protocols where the payload is XORed or hashed
wouldn't be able to do this, but most protocols doing reassembly just
carry the raw payload bytes.
___
Sent via:Wireshark-dev mailing list 
Archives:http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
 mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe