Hi Steven. Thank you for your detailed response. The code will be
executed on a web server with limited memory, hence the desire to keep file
loading in check. I like the scoring approach you have suggested to
give the best guess. It leaves it fairly modular with respect to how
detailed you want
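
A minimal sketch of the scoring idea discussed here, under stated assumptions: each candidate format gets a small scoring function, only a bounded sample of the file is read, and the highest score above a threshold wins. The names `SAMPLE_BYTES`, `SCORERS`, `guess_format` and `guess_file`, the threshold, and the individual heuristics are all hypothetical and are not taken from the thread.

```python
import re

SAMPLE_BYTES = 4096  # hypothetical cap on how much of each file is read


def score_xml(sample):
    # An XML declaration is strong evidence; a leading tag is weaker evidence.
    stripped = sample.lstrip()
    if stripped.startswith("<?xml"):
        return 1.0
    return 0.5 if re.match(r"<[A-Za-z_]", stripped) else 0.0


def score_tab_delimited(sample):
    # Fraction of non-empty lines that contain at least one tab.
    lines = [ln for ln in sample.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    return sum("\t" in ln for ln in lines) / len(lines)


# Adding a format means adding one entry here (assumed design, for modularity).
SCORERS = {"xml": score_xml, "tab": score_tab_delimited}


def guess_format(sample, threshold=0.5):
    """Return the best-scoring format name, or None if nothing is convincing."""
    scores = {name: fn(sample) for name, fn in SCORERS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


def guess_file(path):
    # Only the first SAMPLE_BYTES characters are ever held in memory.
    with open(path, errors="replace") as f:
        return guess_format(f.read(SAMPLE_BYTES))
```

How detailed each scorer becomes can vary per format without touching `guess_format`, which is one way to keep the approach modular.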
On Fri, 23 Sep 2005 01:20:49 -0300, David Pratt wrote:
Hi. I have files that I will be importing in at least four different
plain text formats, one of them being a tab-delimited format, a couple
being token-based formats that use pipes (but are not pipe-delimited), another
being XML. There will likely
Thanks Mike for your reply. I am not aware of libmagic and will look
to see what it provides. As for your first suggestion, this is what
I have been looking at: probably a combination of regex and readlines or
similar, but I am trying to get a better sense of the best approach.
David> I realize CSV module has a sniffer but it is something that is
David> limited more or less to delimited files.
Sure. How about:
def sniff(fname):
    # Peek at the first four characters only; "<?xm" opens an XML declaration.
    if open(fname).read(4) == "<?xm":
        return "xml"
    else:
        # assume csv - use its sniffer to
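
One possible way to complete the idea above, offered as a sketch and not as the original code: check for an XML declaration first, then hand the same bounded sample to the standard library's `csv.Sniffer`. The `sample_size` parameter, the candidate delimiter set, and the choice to return the dialect object are assumptions made for illustration.

```python
import csv


def sniff(fname, sample_size=2048):
    """Return "xml" or a csv Dialect guessed from the start of the file."""
    # Read a bounded sample so large files are never loaded whole.
    with open(fname, newline="") as f:
        sample = f.read(sample_size)
    if sample.lstrip().startswith("<?xml"):
        return "xml"
    # csv.Sniffer raises csv.Error when it cannot determine a delimiter.
    return csv.Sniffer().sniff(sample, delimiters="\t,;|")
```

The returned dialect can be passed straight to `csv.reader(f, dialect)`. As noted in the thread, this only helps for delimited files, so the token-based formats would still need their own check.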
David Pratt [EMAIL PROTECTED] writes:
Thanks Mike for your reply. I am not aware of libmagic and will look
to see what it provides.
and ...
Skip Montanaro [EMAIL PROTECTED] writes:
You can also run the file(1) command and see what it says. I seem
to recall someone asking about the
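
The file(1) suggestion can be tried from Python by shelling out, sketched here under the assumption that a `file` executable is on the PATH of the web server. The helper name `file_type` is hypothetical, and the text that file(1) prints varies between systems, so the result is best treated as a hint.

```python
import shutil
import subprocess


def file_type(path):
    """Return the description printed by file(1), or None if it is unavailable."""
    if shutil.which("file") is None:
        return None
    # -b (brief) drops the leading "filename:" prefix from the output.
    result = subprocess.run(
        ["file", "-b", path],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip() or None
```

Since file(1) is built on libmagic, this gives a quick look at what libmagic would report before committing to a binding for it.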
Hi Skip. Thank you for your reply. This is helpful and I am glad I put
this to the list. There are some really good ideas that will help me
come up with something good to use.
Regards,
David
On Friday, September 23, 2005, at 11:14 AM, [EMAIL PROTECTED] wrote:
David> I realize CSV module
Thanks Mike, this is really great!
Regards,
David
On Friday, September 23, 2005, at 11:55 AM, Mike Meyer wrote:
David Pratt [EMAIL PROTECTED] writes:
Thanks Mike for your reply. I am not aware of libmagic and will look
to see what it provides.
and ...
Skip Montanaro [EMAIL PROTECTED]
Hi. I have files that I will be importing in at least four different
plain text formats, one of them being a tab-delimited format, a couple
being token-based formats that use pipes (but are not pipe-delimited), another
being XML. There will likely be others as well but the data needs to be
extracted and
David Pratt [EMAIL PROTECTED] writes:
Hi. I have files that I will be importing in at least four different
plain text formats, one of them being a tab-delimited format, a couple
being token-based formats that use pipes (but are not pipe-delimited), another
being XML. There will likely be others as well