[ 
https://bro-tracker.atlassian.net/browse/BIT-1235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17700#comment-17700
 ] 

Brian O'Berry edited comment on BIT-1235 at 8/25/14 8:53 AM:
-------------------------------------------------------------

Okay, the patch is incorrect because it assumes only a single file in the 
multipart form, which I think means the solution has to be addressed in tcp 
and/or mime parsing code. Could it be as simple as changing the if statements 
in analyzer::tcp::Content_Analyzer::DoDeliverOnce() to be equality tests rather 
than bitwise ANDs (ContentLine.cc lines 242 and 256)? There's also this code to 
think about in DoDeliver() starting at line 141:
{code}
                if ( (CR_LF_as_EOL & CR_as_EOL) &&
                     last_char == '\r' && *data == '\n' )
                        {
                        // CR is already considered as EOL.
                        // Compress CRLF to just one line termination.
                        //
                        // Note, we test this prior to checking for
                        // "plain delivery" because (1) we might have
                        // made the decision to switch to plain delivery
                        // based on a line terminated with '\r' for
                        // which a '\n' then arrived, and (2) we are
                        // careful when executing plain delivery to
                        // clear last_char once we do so.
                        last_char = *data;
                        --len; ++data; ++seq;
                        ++seq_delivered_in_lines;
                        }
{code}

UPDATE:

Removing the previous patch and changing if stmts in 
Content_Analyzer::DoDeliverOnce() as described, to use '==' rather than '&', 
*almost* produced a correct file.  Bare CR and LF chars were left alone, but 
the trailing CRLF (before the multipart boundary string) was included in the 
file.  Also, I see 2 messages in weird.log, reporting 
line_terminated_with_single_LF and line_terminated_with_single_CR, in that 
order.  The PDF ends with {{%%EOF<LF>}}, and given the order of the weird 
messages, I have to wonder if the parsing code is getting confused by the final 
LFCRLF sequence preceding the closing multipart boundary.


was (Author: boberry):
Okay, the patch is incorrect because it assumes only a single file in the 
multipart form, which I think means the solution has to be addressed in tcp 
and/or mime parsing code. Could it be as simple as changing the if statements 
in analyzer::tcp::Content_Analyzer::DoDeliverOnce() to be equality tests rather 
than bitwise ANDs (ContentLine.cc lines 242 and 256)? There's also this code to 
think about in DoDeliver() starting at line 141:
{code}
                if ( (CR_LF_as_EOL & CR_as_EOL) &&
                     last_char == '\r' && *data == '\n' )
                        {
                        // CR is already considered as EOL.
                        // Compress CRLF to just one line termination.
                        //
                        // Note, we test this prior to checking for
                        // "plain delivery" because (1) we might have
                        // made the decision to switch to plain delivery
                        // based on a line terminated with '\r' for
                        // which a '\n' then arrived, and (2) we are
                        // careful when executing plain delivery to
                        // clear last_char once we do so.
                        last_char = *data;
                        --len; ++data; ++seq;
                        ++seq_delivered_in_lines;
                        }
{code}

> HTTP multipart POST request alters file contents
> ------------------------------------------------
>
>                 Key: BIT-1235
>                 URL: https://bro-tracker.atlassian.net/browse/BIT-1235
>             Project: Bro Issue Tracker
>          Issue Type: Problem
>          Components: Bro
>    Affects Versions: 2.3
>         Environment: CentOS 6.5, file extract analyzer
>            Reporter: Brian O'Berry
>         Attachments: bro-2.3-HTTP.patch, gdb.log, upload-api-http.pcap
>
>
> HTTP POST multipart processing converts bare CR or LF chars to CRLF pairs, 
> corrupting most files when extracted with Files::ANALYZER_EXTRACT.  This is 
> clear in the attached gdb.log, which has a backtrace that shows a buffer with 
> the start of a PDF file entering MIME/HTTP entity processing at frame 25, and 
> emerging with LF chars converted to CRLF at frame 6.
> Also attached are the pcap file associated with the backtrace, and an initial 
> patch that we've barely begun to test.  A point of concern with the patch is 
> that it changes a weird.log entry from "line_terminated_with_single_CR" to 
> "http_no_crlf_in_header_list".  It does enable Files::ANALYZER_EXTRACT to 
> correctly extract the PDF file from the attached pcap.
> Please let me know if we can provide anything else to help with this.



--
This message was sent by Atlassian JIRA
(v6.4-OD-04-006#64001)
_______________________________________________
bro-dev mailing list
[email protected]
http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev

Reply via email to