On Tuesday, 19 April 2016 23:21:42 UTC+10, Sayth Renshaw wrote:
> On Tuesday, 19 April 2016 18:17:02 UTC+10, Peter Otten wrote:
> > Steven D'Aprano wrote:
> >
> > > On Tue, 19 Apr 2016 09:44 am, Sayth Renshaw wrote:
> > >
> > >> Hi
> > >>
> > >> Why would it be that my files are not being found in this script?
> > >
> > > You are calling the script with:
> > >
> > >     python jqxml.py samples *.xml
> > >
> > > This does not do what you think it does: under Linux shells, the glob
> > > *.xml will be expanded by the shell. Fortunately, in your case, you have
> > > no files in the current directory matching the glob *.xml, so it is not
> > > expanded and the arguments your script receives are:
> > >
> > >
> > >     "jqxml.py"  # script name, not used
> > >
> > >     "samples"   # dir
> > >
> > >     "*.xml"     # mask
> > >
> > >
> > > You then call:
> > >
> > >     fileResult = filter(lambda x: x.endswith(mask), files)
> > >
> > > which looks for file names which end with a literal string (asterisk, dot,
> > > x, m, l) in that order. You have no files that match that string.
> > >
> > > At the shell prompt, enter this:
> > >
> > >     touch samples/junk\*.xml
> > >
> > > and run the script again, and you should see that it now matches one file.
> > >
> > > Instead, what you should do is:
> > >
> > >
> > > (1) Use the glob module:
> > >
> > > https://docs.python.org/2/library/glob.html
> > > https://docs.python.org/3/library/glob.html
> > >
> > > https://pymotw.com/2/glob/
> > > https://pymotw.com/3/glob/
> > >
> > >
> > > (2) When calling the script, avoid the shell expanding wildcards by
> > > escaping them or quoting them:
> > >
> > >     python jqxml.py samples "*.xml"
> >
> > (3) *Use* the expansion mechanism provided by the shell instead of fighting
> > it:
> >
> > $ python jqxml.py samples/*.xml
> >
> > This requires that you change your script
> >
> > from pyquery import PyQuery as pq
> > import pandas as pd
> > import sys
> >
> > fileResult = sys.argv[1:]
> >
> > if not fileResult:
> >     print("no files specified")
> >     sys.exit(1)
> >
> > for file in fileResult:
> >     print(file)
> >
> > for items in fileResult:
> >     try:
> >         d = pq(filename=items)
> >     except FileNotFoundError as e:
> >         print(e)
> >         continue
> >     res = d('nomination')
> >     # you could move the attrs definition before the loop
> >     attrs = ('id', 'horse')
> >     # probably a bug: you are overwriting data on every iteration
> >     data = [[res.eq(i).attr(x) for x in attrs] for i in range(len(res))]
> >
> > I think this is the most natural approach if you are willing to accept the
> > quirk that the script tries to process the file 'samples/*.xml' if the
> > samples directory doesn't contain any files with the .xml suffix. Common
> > shell tools work that way:
> >
> > $ ls samples/*.xml
> > samples/1.xml samples/2.xml samples/3.xml
> > $ ls samples/*.XML
> > ls: cannot access samples/*.XML: No such file or directory
> >
> > Unrelated: instead of working with sys.argv directly you could use argparse
> > which is part of the standard library.
> > The code to get at least one file is
> >
> > import argparse
> >
> > parser = argparse.ArgumentParser()
> > parser.add_argument("files", nargs="+")
> > args = parser.parse_args()
> >
> > print(args.files)
> >
> > Note that this doesn't fix the shell expansion oddity.
>
> Hi
>
> Thanks for the insight. After doing a little reading I found this post, which
> uses both argparse and glob and attempts to cover both the Windows and bash
> handling of wildcards:
> http://breathmintsforpenguins.blogspot.com.au/2013/09/python-crossplatform-handling-of.html
>
> import argparse
> from glob import glob
>
> def main(file_names):
>     print(file_names)
>
> if __name__ == "__main__":
>     parser = argparse.ArgumentParser()
>     parser.add_argument("file_names", nargs='*')
>     # nargs='*' tells it to combine all positional arguments into a single list
>     args = parser.parse_args()
>     file_names = list()
>
>     # go through all of the arguments and replace ones with wildcards with
>     # the expansion; glob returns a name without wildcards unchanged only
>     # if that file exists, otherwise it returns an empty list
>     for arg in args.file_names:
>         file_names += glob(arg)
>
>     main(file_names)
>
> And, way beyond my needs for such a tiny script, there is Click, which I think
> is the Flask developers' Python CLI creation package and is based on optparse:
> http://click.pocoo.org/5/why/#why-not-argparse
>
> >     # probably a bug: you are overwriting data on every iteration
> >     data = [[res.eq(i).attr(x) for x in attrs] for i in range(len(res))]
>
> Thanks for picking this up. I will have to append to it on each iteration for
> each attribute.
>
> Thank You
>
> Sayth

Scratch that bit about the code from
http://breathmintsforpenguins.blogspot.com.au/2013/09/python-crossplatform-handling-of.html
I can't get it to work. It is a good general direction, though.

Sayth
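
For reference, a minimal sketch of the argparse-plus-glob idea discussed above, assuming Python 3 and only the standard library. It expands every argument with glob.glob, so it also works when the shell passes a pattern through unexpanded (the Windows command prompt, or a quoted pattern under bash). Because glob.glob returns an empty list for a name that does not exist, patterns with no match are reported rather than silently dropped. The names expand_patterns and main are illustrative only; they are not taken from the blog post or from jqxml.py.

```python
import argparse
import glob


def expand_patterns(patterns):
    """Expand each argument as a glob pattern.

    Returns (found, unmatched): the matching file names in sorted order
    per pattern, and the patterns that matched nothing.
    """
    found, unmatched = [], []
    for pattern in patterns:
        matches = sorted(glob.glob(pattern))
        if matches:
            found.extend(matches)
        else:
            unmatched.append(pattern)
    return found, unmatched


def main(argv=None):
    # argv=None makes argparse read sys.argv[1:], as in a normal script run
    parser = argparse.ArgumentParser()
    parser.add_argument("patterns", nargs="+",
                        help="file names or glob patterns")
    args = parser.parse_args(argv)

    files, unmatched = expand_patterns(args.patterns)
    for pattern in unmatched:
        print("no match for", pattern)
    for name in files:
        print(name)
    return files
```

To use it as a script, call main() under the usual `if __name__ == "__main__":` guard and run, for example, `python jqxml.py "samples/*.xml"`. If the shell has already expanded the pattern, each resulting file name simply globs to itself.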