On 24Dec2015 13:54, richard kappler <richkapp...@gmail.com> wrote:
I have to create a script that reads xml data over a tcp socket, parses it
and outputs it to console. Not so bad, most of which I already know how to
do. I know how to set up the socket, though I am using a file for
development and testing, am using lxml and have created an xslt that does
what I want with the xml, and it outputs it to console.

What I'm not really sure of, each xml 'message' is preceded by an STX
(\x02) and ends with an ETX (\x03). These 'messages' (Danny, are you noting
I don't say -lines- anymore? :-)  ) need to be parsed and output whole as
opposed to partial.

My concern is, there will actually be numerous machines sending data to the
tcp socket, so it's entirely likely the messages will come in fragmented
and the fragments will need to be held until complete so they can be sent
on whole to the parser. While this is the job of tcp, my script needs to

I think what I need to do would be analogous to (pardon if I'm using the
wrong terminology, at this point in the discussion I am officially out of
my depth) sending the input stream to a buffer(s) until the ETX for that
message comes in, shoot the buffer contents to the parser while accepting
the next STX + message fragment into the buffer, or something analogous.

Any guidance here?

Since a TCP stream runs from one machine to another (possibly the same machine), presumably you actually have multiple TCP streams to manage, and must service them at the same time; otherwise you could just process one until EOF, then the next, and so on. Correct?

My personal inclination would be to start a Thread for each stream, and have that thread simply read the stream, extracting XML chunks, and then .put each chunk on a Queue used by whatever does stuff with the XML (accept chunk, parse, etc). If you need to know where the chunk came from, .put a tuple with the chunk and some context information.
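
Something like the following might do as a starting point. It is only a rough sketch, not a finished implementation: the names extract_messages, read_stream and source are made up for illustration, the 4096 byte read size is arbitrary, and it assumes the STX and ETX bytes never occur inside the XML itself. Each thread keeps its own buffer, so a partial message from one stream is held until its ETX arrives and never mixes with data from another stream.

```python
import queue
import socket
import threading

STX = b"\x02"  # start-of-message marker
ETX = b"\x03"  # end-of-message marker


def extract_messages(buffer):
    """Pull every complete STX...ETX message out of buffer.

    Returns (messages, remainder). The remainder is the trailing partial
    message (starting at its STX), or b"" if there is none. Bytes that
    appear before an STX are discarded.
    """
    messages = []
    while True:
        start = buffer.find(STX)
        if start < 0:
            return messages, b""
        end = buffer.find(ETX, start + 1)
        if end < 0:
            # Incomplete message: keep it for the next read.
            return messages, buffer[start:]
        messages.append(buffer[start + 1:end])
        buffer = buffer[end + 1:]


def read_stream(conn, source, out_queue):
    """Read one connection until EOF, queueing (source, message) tuples."""
    buffer = b""
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:  # the peer closed the connection
                break
            buffer += data
            messages, buffer = extract_messages(buffer)
            for message in messages:
                out_queue.put((source, message))


if __name__ == "__main__":
    # Demonstration over a local socket pair instead of a real listener.
    sender, receiver = socket.socketpair()
    chunks = queue.Queue()
    worker = threading.Thread(target=read_stream,
                              args=(receiver, "demo", chunks))
    worker.start()
    sender.sendall(STX + b"<msg>one</msg>" + ETX + STX + b"<msg>tw")
    sender.sendall(b"o</msg>" + ETX)  # the second message arrives in two pieces
    sender.close()
    worker.join()
    while not chunks.empty():
        print(chunks.get())
```

In a real server you would start one such thread per accepted connection, and whatever does the parsing would loop on the Queue's .get() method and hand each whole message to the parser.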

Does that help you move forward?

Cheers,
Cameron Simpson <c...@zip.com.au>
