[
https://bro-tracker.atlassian.net/browse/BIT-1235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17700#comment-17700
]
Brian O'Berry edited comment on BIT-1235 at 8/25/14 8:53 AM:
-------------------------------------------------------------
Okay, the patch is incorrect because it assumes only a single file in the
multipart form, which I think means the solution has to be addressed in tcp
and/or mime parsing code. Could it be as simple as changing the if statements
in analyzer::tcp::Content_Analyzer::DoDeliverOnce() to be equality tests rather
than bitwise ANDs (ContentLine.cc lines 242 and 256)? There's also this code to
think about in DoDeliver() starting at line 141:
{code}
if ( (CR_LF_as_EOL & CR_as_EOL) &&
last_char == '\r' && *data == '\n' )
{
// CR is already considered as EOL.
// Compress CRLF to just one line termination.
//
// Note, we test this prior to checking for
// "plain delivery" because (1) we might have
// made the decision to switch to plain delivery
// based on a line terminated with '\r' for
// which a '\n' then arrived, and (2) we are
// careful when executing plain delivery to
// clear last_char once we do so.
last_char = *data;
--len; ++data; ++seq;
++seq_delivered_in_lines;
}
{code}
UPDATE:
Removing the previous patch and changing if stmts in
Content_Analyzer::DoDeliverOnce() as described, to use '==' rather than '&',
*almost* produced a correct file. Bare CR and LF chars were left alone, but
the trailing CRLF (before the multipart boundary string) was included in the
file. Also, I see 2 messages in weird.log, reporting
line_terminated_with_single_LF and line_terminated_with_single_CR, in that
order. The PDF ends with {{%%EOF<LF>}}, and given the order of the weird
messages, I have to wonder if the parsing code is getting confused by the final
LFCRLF sequence preceding the closing multipart boundary.
was (Author: boberry):
Okay, the patch is incorrect because it assumes only a single file in the
multipart form, which I think means the solution has to be addressed in tcp
and/or mime parsing code. Could it be as simple as changing the if statements
in analyzer::tcp::Content_Analyzer::DoDeliverOnce() to be equality tests rather
than bitwise ANDs (ContentLine.cc lines 242 and 256)? There's also this code to
think about in DoDeliver() starting at line 141:
{code}
if ( (CR_LF_as_EOL & CR_as_EOL) &&
last_char == '\r' && *data == '\n' )
{
// CR is already considered as EOL.
// Compress CRLF to just one line termination.
//
// Note, we test this prior to checking for
// "plain delivery" because (1) we might have
// made the decision to switch to plain delivery
// based on a line terminated with '\r' for
// which a '\n' then arrived, and (2) we are
// careful when executing plain delivery to
// clear last_char once we do so.
last_char = *data;
--len; ++data; ++seq;
++seq_delivered_in_lines;
}
{code}
> HTTP multipart POST request alters file contents
> ------------------------------------------------
>
> Key: BIT-1235
> URL: https://bro-tracker.atlassian.net/browse/BIT-1235
> Project: Bro Issue Tracker
> Issue Type: Problem
> Components: Bro
> Affects Versions: 2.3
> Environment: CentOS 6.5, file extract analyzer
> Reporter: Brian O'Berry
> Attachments: bro-2.3-HTTP.patch, gdb.log, upload-api-http.pcap
>
>
> HTTP POST multipart processing converts bare CR or LF chars to CRLF pairs,
> corrupting most files when extracted with Files::ANALYZER_EXTRACT. This is
> clear in the attached gdb.log, which has a backtrace that shows a buffer with
> the start of a PDF file entering MIME/HTTP entity processing at frame 25, and
> emerging with LF chars converted to CRLF at frame 6.
> Also attached are the pcap file associated with the backtrace, and an initial
> patch that we've barely begun to test. A point of concern with the patch is
> that it changes a weird.log entry from "line_terminated_with_single_CR" to
> "http_no_crlf_in_header_list". It does enable Files::ANALYZER_EXTRACT to
> correctly extract the PDF file from the attached pcap.
> Please let me know if we can provide anything else to help with this.
--
This message was sent by Atlassian JIRA
(v6.4-OD-04-006#64001)
_______________________________________________
bro-dev mailing list
[email protected]
http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev