Re: Sniffing Text Files

2005-09-26 Thread David Pratt
Hi Steven. Thank you for your detailed response. The code will be executed on a web server with limited memory, hence the desire to keep file loading in check. I like the approach you have suggested of scoring each format to give the best guess. It leaves it fairly modular with respect to how detailed you want
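
The scoring idea mentioned above could be sketched roughly as follows. This is only an illustration and not code from the thread: the scorer functions, the `SCORERS` table, `read_head`, `best_guess`, and the 50-line limit are all hypothetical names and values, and the "pipe" check is a placeholder because the real token-based formats are not described here. Only the first few lines of a file are read, which keeps memory use bounded.

```python
from itertools import islice


def read_head(fname, max_lines=50):
    """Read at most max_lines lines so a large file is never loaded whole."""
    with open(fname) as f:
        return [line.rstrip("\n") for line in islice(f, max_lines)]


def _fraction(lines, predicate):
    """Fraction of non-empty lines for which predicate(line) is true."""
    nonempty = [line for line in lines if line.strip()]
    if not nonempty:
        return 0.0
    return sum(1 for line in nonempty if predicate(line)) / len(nonempty)


def score_tab(lines):
    # Hypothetical heuristic: tab-delimited files have a tab on most lines.
    return _fraction(lines, lambda line: "\t" in line)


def score_xml(lines):
    # Hypothetical heuristic: XML lines usually start with "<".
    return _fraction(lines, lambda line: line.lstrip().startswith("<"))


def score_pipe(lines):
    # Placeholder heuristic for the pipe-using token formats (an assumption).
    return _fraction(lines, lambda line: "|" in line)


# Adding a new format means adding one scorer here.
SCORERS = {"tab": score_tab, "xml": score_xml, "pipe": score_pipe}


def best_guess(lines):
    """Return the name of the highest-scoring format, or None if all score 0."""
    scores = {name: scorer(lines) for name, scorer in SCORERS.items()}
    name = max(scores, key=scores.get)
    return name if scores[name] > 0 else None
```

A caller would use something like `best_guess(read_head(fname))`. Each scorer can be made as detailed as needed without touching the others.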

Re: Sniffing Text Files

2005-09-23 Thread Steven D'Aprano
On Fri, 23 Sep 2005 01:20:49 -0300, David Pratt wrote: Hi. I have files that I will be importing in at least four different plain text formats, one of them being tab-delimited format, a couple being token-based formats that use pipes (but are not delimited with pipes), another being xml. There will likely

Re: Sniffing Text Files

2005-09-23 Thread David Pratt
Thanks Mike for your reply. I am not aware of libmagic and will look to see what it provides. As far as your first suggestion, this is what I have been looking at - probably a combination of regex and readlines or similar, but I am trying to get a better sense of the best sort of approach.

Re: Sniffing Text Files

2005-09-23 Thread skip
David> I realize CSV module has a sniffer but it is something that is limited more or less to delimited files. Sure. How about: def sniff(fname): if open(fname).read(4) == xml: return xml else: # assume csv - use its sniffer to
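
The snippet above is cut off, so here is a minimal runnable sketch of the same idea rather than Skip's actual code: check a small prefix for XML, otherwise hand the sample to the standard library's `csv.Sniffer`. The `sample_size` parameter, the `"<?xml"` prefix test, the candidate delimiter list, and the returned labels are all assumptions made for illustration.

```python
import csv


def sniff(fname, sample_size=4096):
    """Guess a file's format from its first sample_size characters."""
    # Read only a bounded prefix so large files are never loaded whole.
    with open(fname, newline="") as f:
        sample = f.read(sample_size)

    # Assumption: XML files begin with an XML declaration.
    if sample.lstrip().startswith("<?xml"):
        return "xml"

    # Otherwise let the csv module's sniffer look for a delimiter.
    try:
        dialect = csv.Sniffer().sniff(sample, delimiters="\t,;")
    except csv.Error:
        # The sniffer could not settle on a delimiter.
        return "unknown"
    return "tab" if dialect.delimiter == "\t" else "csv"
```

XML without a declaration would fall through to the CSV sniffer here, so a real version would likely need a more forgiving XML check.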

Re: Sniffing Text Files

2005-09-23 Thread Mike Meyer
David Pratt [EMAIL PROTECTED] writes: Thanks Mike for your reply. I am not aware of libmagic and will look to see what it provides. and ... Skip Montanaro [EMAIL PROTECTED] writes: You can also run the file(1) command and see what it says. I seem to recall someone asking about the

Re: Sniffing Text Files

2005-09-23 Thread David Pratt
Hi Skip. Thank you for your reply. This is helpful and I am glad I put this to the list. There are some really good ideas that will help me come up with something good to use. Regards, David On Friday, September 23, 2005, at 11:14 AM, [EMAIL PROTECTED] wrote: David I realize CSV module

Re: Sniffing Text Files

2005-09-23 Thread David Pratt
Thanks Mike this is really great! Regards, David On Friday, September 23, 2005, at 11:55 AM, Mike Meyer wrote: David Pratt [EMAIL PROTECTED] writes: Thanks Mike for your reply. I am not aware of libmagic and will look to see what it provides. and ... Skip Montanaro [EMAIL PROTECTED]

Sniffing Text Files

2005-09-22 Thread David Pratt
Hi. I have files that I will be importing in at least four different plain text formats, one of them being tab-delimited format, a couple being token-based formats that use pipes (but are not delimited with pipes), another being xml. There will likely be others as well but the data needs to be extracted and

Re: Sniffing Text Files

2005-09-22 Thread Mike Meyer
David Pratt [EMAIL PROTECTED] writes: Hi. I have files that I will be importing in at least four different plain text formats, one of them being tab-delimited format, a couple being token-based formats that use pipes (but are not delimited with pipes), another being xml. There will likely be others as well