Hello,

I'm a newbie to Python.  I wrote a Python script which connects to my 
Geodatabase (ESRI ArcGIS File Geodatabase), retrieves the records, then 
proceeds to evaluate which ones are duplicated.  I do this using lists.  
Someone suggested I use arrays instead.  Below is the content of my script.  
Does anyone have any ideas on how an array could improve performance?  Right 
now the script takes 2.5 minutes to run on a recordset of 79k+ records:

from __future__ import division
import sys, string, os, arcgisscripting, time
from time import localtime, strftime

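def writeMessage(myMsg):
    # Echo the message to the console and append it to the log file
    print myMsg
    global log
    log = open(logFile, 'a')
    log.write(myMsg + "\n")
    # Close after each write so file handles are not left open
    log.close()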

logFile = "c:\\temp\\" + str(strftime("%Y%m%d %H%M%S", localtime())) + ".log"

writeMessage(' ')
writeMessage(str(strftime("%H:%M:%S", localtime())) + ' begin unique values test')

# Create the Geoprocessor object
gp = arcgisscripting.create(9.3)
oid_list = []
dup_list = []
tmp_list = []
myWrkspc = "c:\\temp\\TVM Geodatabase GDIschema v6.0.2 PilotData.gdb"
myFtrCls = "\\Landbase\\T_GroundContour"

writeMessage(' ')
writeMessage('gdb: ' + myWrkspc)
writeMessage('ftr: ' + myFtrCls)
writeMessage(' ')
writeMessage(str(strftime("%H:%M:%S", localtime())) + ' retrieving recordset...')

rows = gp.SearchCursor(myWrkspc + myFtrCls,"","","GDI_OID")
row = rows.Next()
writeMessage(' ')
writeMessage(str(strftime("%H:%M:%S", localtime())) + ' processing recordset...')
while row:
    if row.GDI_OID in oid_list:
        tmp_list.append(row.GDI_OID)
    oid_list.append(row.GDI_OID)
    row = rows.Next()

writeMessage(' ')
writeMessage(str(strftime("%H:%M:%S", localtime())) + ' generating statistics...')

dup_count = len(tmp_list)
tmp_list = list(set(tmp_list))
tmp_list.sort()

for oid in tmp_list:
    a = str(oid) + '     '
    while len(a) < 20:
        a = a + ' '
    dup_list.append(a + '(' + str(oid_list.count(oid)) + ')')

for dup in dup_list:
    writeMessage(dup)

writeMessage(' ')
writeMessage('records    : ' + str(len(oid_list)))
writeMessage('duplicates : ' + str(dup_count))
writeMessage('% errors   : ' + str(round(100 * dup_count / len(oid_list), 4)))
writeMessage(' ')
writeMessage(str(strftime("%H:%M:%S", localtime())) + ' unique values test complete')

log.close()
del dup, dup_count, dup_list, gp, log, logFile, myFtrCls, myWrkspc
del oid, oid_list, row, rows, tmp_list
exit()
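
For comparison, here is a minimal sketch of the same duplicate check done with a dictionary of counts instead of repeated list lookups and list.count() calls.  It is only an illustration under some assumptions: find_duplicates and the sample IDs are made up for this example, it does not touch the geodatabase or the cursor, and it has not been timed against the real recordset.  It sticks to a plain dict so it should not depend on a newer Python version.

```python
def find_duplicates(oids):
    """Return (dup_count, dups) where dups maps each repeated ID to its count."""
    counts = {}
    for oid in oids:
        # Dictionary lookups do not scan the whole collection like "in list" does
        counts[oid] = counts.get(oid, 0) + 1
    dups = dict((oid, n) for oid, n in counts.items() if n > 1)
    # Same meaning as len(tmp_list) in the script: every occurrence after the first
    dup_count = sum(n - 1 for n in dups.values())
    return dup_count, dups

# Hypothetical GDI_OID values standing in for the cursor rows
sample_oids = [101, 102, 101, 103, 102, 101]
dup_count, dups = find_duplicates(sample_oids)

for oid in sorted(dups):
    # Pad the ID to 20 characters, then show how many times it occurs
    print(str(oid).ljust(20) + '(' + str(dups[oid]) + ')')
```

In the real script the loop body would presumably run once per row from the cursor (counts[row.GDI_OID] = ...), so the list is never searched at all.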


Thanks!

Paul J. Scipione
GIS Database Administrator
work: 602-371-7091
cell: 480-980-4721


