Hello,
I'm a newbie to Python. I wrote a Python script which connects to my
Geodatabase (ESRI ArcGIS File Geodatabase), retrieves the records, then
proceeds to evaluate which ones are duplicated. I do this using lists.
Someone suggested I use arrays instead. Below is the content of my script.
Does anyone have any ideas on how an array could improve performance? Right now the
script takes 2.5 minutes to run on a recordset of 79k+ records:
from __future__ import division
import sys, string, os, arcgisscripting, time
from time import localtime, strftime
def writeMessage(myMsg):
    print myMsg
    global log
    log = open(logFile, 'a')
    log.write(myMsg + "\n")
logFile = "c:\\temp\\" + str(strftime("%Y%m%d %H%M%S", localtime())) + ".log"
writeMessage(' ')
writeMessage(str(strftime("%H:%M:%S", localtime())) + ' begin unique values test')
# Create the Geoprocessor object
gp = arcgisscripting.create(9.3)
oid_list = []
dup_list = []
tmp_list = []
myWrkspc = "c:\\temp\\TVM Geodatabase GDIschema v6.0.2 PilotData.gdb"
myFtrCls = "\\Landbase\\T_GroundContour"
writeMessage(' ')
writeMessage('gdb: ' + myWrkspc)
writeMessage('ftr: ' + myFtrCls)
writeMessage(' ')
writeMessage(str(strftime("%H:%M:%S", localtime())) + ' retrieving recordset...')
rows = gp.SearchCursor(myWrkspc + myFtrCls,"","","GDI_OID")
row = rows.Next()
writeMessage(' ')
writeMessage(str(strftime("%H:%M:%S", localtime())) + ' processing recordset...')
while row:
    if row.GDI_OID in oid_list:
        tmp_list.append(row.GDI_OID)
    oid_list.append(row.GDI_OID)
    row = rows.Next()
writeMessage(' ')
writeMessage(str(strftime("%H:%M:%S", localtime())) + ' generating statistics...')
dup_count = len(tmp_list)
tmp_list = list(set(tmp_list))
tmp_list.sort()
for oid in tmp_list:
    a = str(oid) + ' '
    while len(a) < 20:
        a = a + ' '
    dup_list.append(a + '(' + str(oid_list.count(oid)) + ')')
for dup in dup_list:
    writeMessage(dup)
writeMessage(' ')
writeMessage('records : ' + str(len(oid_list)))
writeMessage('duplicates : ' + str(dup_count))
writeMessage('% errors : ' + str(round(dup_count / len(oid_list), 4)))
writeMessage(' ')
writeMessage(str(strftime("%H:%M:%S", localtime())) + ' unique values test complete')
log.close()
del dup, dup_count, dup_list, gp, log, logFile, myFtrCls, myWrkspc
del oid, oid_list, row, rows, tmp_list
exit()
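
One thing I have been wondering about, as a rough sketch only: both the
"in oid_list" test and oid_list.count(oid) scan the whole list each time, so
the work grows roughly with the square of the record count. Keeping a running
count per GDI_OID in a dictionary might avoid that. The function name
count_duplicates and the sample values below are made up for illustration and
are not part of the script above; it uses only a plain dict so it should not
depend on a newer Python than the one shipped with ArcGIS.

```python
def count_duplicates(oids):
    """Return (total, dup_count, dups) for an iterable of ID values.

    dup_count counts every record after the first occurrence of a value,
    which matches how the script above fills tmp_list.
    dups maps each duplicated value to its number of occurrences.
    """
    counts = {}
    total = 0
    for oid in oids:
        # Dictionary lookups take roughly constant time, unlike "in some_list".
        counts[oid] = counts.get(oid, 0) + 1
        total += 1
    dups = dict((oid, n) for oid, n in counts.items() if n > 1)
    dup_count = sum(n - 1 for n in dups.values())
    return total, dup_count, dups


# Hypothetical sample data standing in for the GDI_OID values from the cursor.
sample = [101, 102, 101, 103, 102, 101]
total, dup_count, dups = count_duplicates(sample)
```

In the real script the cursor loop would feed row.GDI_OID into the same kind
of dictionary instead of appending to oid_list, and the report would read the
counts from dups rather than calling oid_list.count().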
Thanks!
Paul J. Scipione
GIS Database Administrator
work: 602-371-7091
cell: 480-980-4721