On 02/06/15 08:27, Alan Gauld wrote:

The following is a sample of the test code, as well as the url/posts
of the pages as produced by the Firefox/Firebug process.

I'm not really answering your question but addressing some
issues in your code...

execfile('/apps/parseapp2/ascii_strip.py')
execfile('dir_defs_inc.py')

I'm not sure what these do, but it's usually better to
import the files as modules and then call their
functions directly.

appDir="/apps/parseapp2/"

# data output filename
datafile="unlvDept.dat"


# global var for the parent/child list json
plist={}


cname="unlv.lwp"

#----------------------------------------

if __name__ == "__main__":
# main app

It makes testing (and reuse) easier if you put the main code
in a function called main() and then just call that here.

Also your code could be broken up into smaller functions
which again will make testing and debugging easier.
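
A minimal sketch of that layout. The names fetch_page() and
parse_level() are hypothetical placeholders, not functions from
your script; only the main() pattern itself is the point.

```python
import sys


def fetch_page(url):
    # Hypothetical placeholder: the real script would run curl here.
    return "<html>%s</html>" % url


def parse_level(html):
    # Hypothetical placeholder: the real script would parse the HTML.
    return len(html)


def main(argv=None):
    # Keeping the top-level logic in a function means a test can
    # call main() directly instead of running the whole script.
    if argv is None:
        argv = sys.argv[1:]
    url = argv[0] if argv else "http://example.com/"
    html = fetch_page(url)
    return parse_level(html)


if __name__ == "__main__":
    main()
```

Each small function can then be imported and tested on its own.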

  #
  # get the input struct, parse it, determine the level
  #

  cmd="echo '' > "+datafile
  proc=subprocess.Popen(cmd, shell=True,stdout=subprocess.PIPE)
  res=proc.communicate()[0].strip()

It's easier and more efficient/reliable to create the
file directly from Python. Calling the subprocess module
each time starts up extra processes.

Also you store the result but never use it...
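
Something like the following would do the same job without a
subprocess. It is only a sketch: reset_file() is a name I made up,
and the temporary directory is there just so the demonstration
does not overwrite anything real.

```python
import os
import tempfile


def reset_file(path):
    # Opening in "w" mode creates the file or truncates an existing one.
    # Writing "\n" matches what `echo '' > file` leaves behind.
    with open(path, "w") as handle:
        handle.write("\n")


# Demonstrated in a temporary directory; in the script the argument
# would be datafile or cname.
demo_path = os.path.join(tempfile.mkdtemp(), "unlvDept.dat")
reset_file(demo_path)
```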

  cmd="echo '' > "+cname
  proc=subprocess.Popen(cmd, shell=True,stdout=subprocess.PIPE)
  res=proc.communicate()[0].strip()

See above


  cmd='curl -vvv  '
  cmd=cmd+'-A  "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.11) Gecko/2009061118 Fedora/3.0.11-1.fc9 Firefox/3.0.11"'
  cmd=cmd+'   --cookie-jar '+cname+' --cookie '+cname+'    '
  cmd=cmd+'-L "http://www.lonestar.edu/class-search.htm";'

You build up strings like this many times, but it's very
inefficient. There are several better options:
1) create a list of substrings, then use join() to convert
   the list to a string.
2) use a triple-quoted string to create the string once only.

And since you are mostly passing them to Popen, look at the
docs to see how to pass a list of args instead of one large
string. It's more secure and generally better practice.
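
As a rough sketch of the list approach: curl_args() and run() are
hypothetical helper names, and the options are simply the ones
from your command. With a list and no shell=True, Popen starts
curl directly and no shell quoting is needed.

```python
import subprocess

USER_AGENT = ("Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.11) "
              "Gecko/2009061118 Fedora/3.0.11-1.fc9 Firefox/3.0.11")


def curl_args(url, cookie_file):
    # One list element per argument, so nothing needs quoting.
    return [
        "curl", "-vvv",
        "-A", USER_AGENT,
        "--cookie-jar", cookie_file,
        "--cookie", cookie_file,
        "-L", url,
    ]


def run(args):
    # Passing a list (without shell=True) runs the program directly
    # rather than going through the shell.
    proc = subprocess.Popen(args, stdout=subprocess.PIPE)
    return proc.communicate()[0].strip()


args = curl_args("http://www.lonestar.edu/class-search.htm", "unlv.lwp")
# join() builds one string from the list; handy for logging only.
printable = " ".join(args)
```

Calling run(args) would then fetch the page, assuming curl is
installed.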

  cmd='curl -vvv  '
  cmd=cmd+'-A  "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.11) Gecko/2009061118 Fedora/3.0.11-1.fc9 Firefox/3.0.11"'
  cmd=cmd+'   --cookie-jar '+cname+' --cookie '+cname+'    '
  cmd=cmd+'-L "https://campus.lonestar.edu/classsearch.htm";'

   #initial page
  cmd='curl -vvv  '
  cmd=cmd+'-A  "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.11) Gecko/2009061118 Fedora/3.0.11-1.fc9 Firefox/3.0.11"'
  cmd=cmd+'   --cookie-jar '+cname+' --cookie '+cname+'    '
  cmd=cmd+'-L "https://my.unlv.nevada.edu/psc/lvporprd/EMPLOYEE/HRMS/c/COMMUNITY_ACCESS.CLASS_SEARCH.GBL";'

  proc=subprocess.Popen(cmd, shell=True,stdout=subprocess.PIPE)
  res2=proc.communicate()[0].strip()

  print res2

  sys.exit()

Since this is unconditional, you always exit here, so nothing
else ever gets executed. This may be the cause of your problem?

  # s contains HTML not XML text
  d = libxml2dom.parseString(res2, html=1)

  #-----------Form------------

  selpath="//input[@id='ICSID']//attribute::value"

  sel_ = d.xpath(selpath)


  if (len(sel_) == 0):
    sys.exit()

  val=""
  ndx=0
  for a in sel_:
    val=a.textContent.strip()

  print val
  #sys.exit()

  if(val==""):
    sys.exit()


  #build the 1st post

  ddd=1

  post=""

This does nothing since you immediately replace it with the next line.

  post="ICAJAX=1"
  post=post+"&ICAPPCLSDATA="
  post=post+"&ICAction=DERIVED_CLSRCH_SSR_EXPAND_COLLAPS%24149%24%241"
  post=post+"&ICActionPrompt=false"
  post=post+"&ICAddCount="
  post=post+"&ICAutoSave=0"
  post=post+"&ICBcDomData=undefined"
  post=post+"&ICChanged=-1"
  post=post+"&ICElementNum=0"
  post=post+"&ICFind="
  post=post+"&ICFocus="
  post=post+"&ICNAVTYPEDROPDOWN=0"
  post=post+"&ICResubmit=0"
  post=post+"&ICSID="+urllib.quote(val)
  post=post+"&ICSaveWarningFilter=0"
  post=post+"&ICStateNum="+str(ddd)
  post=post+"&ICType=Panel"
  post=post+"&ICXPos=0"
  post=post+"&ICYPos=114"
  post=post+"&ResponsetoDiffFrame=-1"
  post=post+"&SSR_CLSRCH_WRK_SSR_OPEN_ONLY$chk$3=N"
  post=post+"&SSR_CLSRCH_WRK_SUBJECT$0=ACC"
  post=post+"&TargetFrameName=None"

Since these are nearly all hard-coded strings, you might as well
have hard-coded the final string, substituting only the ICSID and
ICStateNum values, and saved a lot of processing (and code space).
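
A different option from hard-coding, shown here only as a sketch,
is to keep the fields in a list and let urlencode() build the
string. build_post() is a made-up name and the list is abbreviated
to a few of your fields. Note that urlencode() also percent-encodes
the "$" in field names; I am assuming the server decodes that the
same way, which you would need to check.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2, as in the original script


def build_post(icsid, state_num):
    # A list of pairs keeps the fields in a fixed order.
    # Only a handful of the form fields are shown here.
    fields = [
        ("ICAJAX", "1"),
        ("ICAction", "DERIVED_CLSRCH_SSR_EXPAND_COLLAPS$149$$1"),
        ("ICSID", icsid),
        ("ICStateNum", str(state_num)),
        ("ICType", "Panel"),
        ("SSR_CLSRCH_WRK_SUBJECT$0", "ACC"),
    ]
    # urlencode() quotes every name and value and joins them with "&".
    return urlencode(fields)


post = build_post("a/b=", 1)
```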

  cmd='curl -vvv  '
  cmd=cmd+'-A  "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.11) Gecko/2009061118 Fedora/3.0.11-1.fc9 Firefox/3.0.11"'
  cmd=cmd+'   --cookie-jar '+cname+' --cookie '+cname+'    '
  cmd=cmd+'-e "https://my.unlv.nevada.edu/psc/lvporprd/EMPLOYEE/HRMS/c/COMMUNITY_ACCESS.CLASS_SEARCH.GBL?&";

This looks awfully similar to the code up above. Could you have
reused the command? Maybe with some parameters: check out string
formatting operations, e.g.
'This string takes %s as a parameter' % 'a string'

I'll stop here; it's all getting a bit repetitive,
which is in itself a sign that you need to create some functions.
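
For instance, if you stay with a single command string, one
function plus string formatting could replace every copy of the
curl command. This is a sketch only: curl_command() and its
"option" parameter are hypothetical, and I have left out whatever
followed the -e URL in your second command.

```python
AGENT = ("Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.11) "
         "Gecko/2009061118 Fedora/3.0.11-1.fc9 Firefox/3.0.11")

# Named placeholders are filled in from a dictionary by the % operator.
CURL_TEMPLATE = ('curl -vvv -A "%(agent)s" --cookie-jar %(cookies)s '
                 '--cookie %(cookies)s %(option)s "%(url)s"')


def curl_command(url, cookies, option="-L"):
    # Only the parts that change between requests are parameters.
    return CURL_TEMPLATE % {
        "agent": AGENT,
        "cookies": cookies,
        "option": option,
        "url": url,
    }


first = curl_command("https://campus.lonestar.edu/classsearch.htm",
                     "unlv.lwp")
```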

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

