Hello, I'm a newbie to Python. I wrote a Python script that connects to my geodatabase (ESRI ArcGIS File Geodatabase), retrieves the records, and then evaluates which ones are duplicated. I do this using lists. Someone suggested I use arrays instead. Below is the content of my script. Does anyone have any ideas on how an array could improve performance? Right now the script takes 2.5 minutes to run on a recordset of 79k+ records:

from __future__ import division
import sys, string, os, arcgisscripting, time
from time import localtime, strftime

def writeMessage(myMsg):
    print myMsg
    global log
    log = open(logFile, 'a')
    log.write(myMsg + "\n")

logFile = "c:\\temp\\" + str(strftime("%Y%m%d %H%M%S", localtime())) + ".log"

writeMessage(' ')
writeMessage(str(strftime("%H:%M:%S", localtime())) + ' begin unique values test')

# Create the Geoprocessor object
gp = arcgisscripting.create(9.3)
oid_list = []
dup_list = []
tmp_list = []
myWrkspc = "c:\\temp\\TVM Geodatabase GDIschema v6.0.2 PilotData.gdb"
myFtrCls = "\\Landbase\\T_GroundContour"

writeMessage(' ')
writeMessage('gdb: ' + myWrkspc)
writeMessage('ftr: ' + myFtrCls)
writeMessage(' ')
writeMessage(str(strftime("%H:%M:%S", localtime())) + ' retrieving recordset...')

rows = gp.SearchCursor(myWrkspc + myFtrCls,"","","GDI_OID")
row = rows.Next()

writeMessage(' ')
writeMessage(str(strftime("%H:%M:%S", localtime())) + ' processing recordset...')

while row:
    if row.GDI_OID in oid_list:
        tmp_list.append(row.GDI_OID)
    oid_list.append(row.GDI_OID)
    row = rows.Next()

writeMessage(' ')
writeMessage(str(strftime("%H:%M:%S", localtime())) + ' generating statistics...')

dup_count = len(tmp_list)
tmp_list = list(set(tmp_list))
tmp_list.sort()
for oid in tmp_list:
    a = str(oid) + ' '
    while len(a) < 20:
        a = a + ' '
    dup_list.append(a + '(' + str(oid_list.count(oid)) + ')')

for dup in dup_list:
    writeMessage(dup)

writeMessage(' ')
writeMessage('records : ' + str(len(oid_list)))
writeMessage('duplicates : ' + str(dup_count))
writeMessage('% errors : ' + str(round(dup_count / len(oid_list), 4)))
writeMessage(' ')
writeMessage(str(strftime("%H:%M:%S", localtime())) + ' unique values test complete')

log.close()
del dup, dup_count, dup_list, gp, log, logFile, myFtrCls, myWrkspc
del oid, oid_list, row, rows, tmp_list
exit()

Thanks!

Paul J. Scipione
GIS Database Administrator
work: 602-371-7091
cell: 480-980-4721
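
A note on where the time in the script above probably goes: `row.GDI_OID in oid_list` scans the list once per record, and `oid_list.count(oid)` scans it again for every duplicated value, so the work grows roughly with the square of the record count. An array would most likely scan the same way. The sketch below swaps in a plain dictionary instead, which is a different technique from the one in the script and is offered as an untested-on-ArcGIS assumption, not a confirmed fix. The function names and the sample values are hypothetical; in the real script the values would come from `row.GDI_OID` inside the cursor loop.

```python
def count_duplicates(oids):
    """Return {oid: occurrences} for every value that appears more than once."""
    counts = {}
    for oid in oids:
        # A dictionary lookup is a hash probe, so its cost does not grow
        # with the number of values already seen (unlike `x in some_list`).
        counts[oid] = counts.get(oid, 0) + 1
    return dict((oid, n) for oid, n in counts.items() if n > 1)


def format_report(duplicates):
    """One line per duplicated value, padded to 20 columns as in the script."""
    return [(str(oid) + ' ').ljust(20) + '(' + str(n) + ')'
            for oid, n in sorted(duplicates.items())]


# Hypothetical sample values standing in for GDI_OID; not real data.
sample_oids = [101, 102, 101, 103, 102, 101]
duplicates = count_duplicates(sample_oids)
report_lines = format_report(duplicates)

# Same meaning as dup_count in the script: occurrences beyond the first.
dup_count = sum(n - 1 for n in duplicates.values())
```

With this shape, the cursor loop would only need to update the dictionary for each row, and the total record count is the sum of all the counts rather than `len(oid_list)`.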