Am 01.04.2011 21:31, schrieb Karim:
On 04/01/2011 08:41 PM, Knacktus wrote:
Am 01.04.2011 19:16, schrieb Karim:


Hello,

I would like to get advice on the best practice to count elements in a
list (built from scractch).
The number of elements is in the range 1e3 and 1e6.

1) I could create a generator and set a counter (i +=1) in the loop.

2) or simply len(mylist).

I don't need the content of the list, indeed, in term of memory I don't
want to wast it. But I suppose len() is optimized too (C impementation).

If you have some thought to share don't hesitate.

Just a general suggestion: Provide code examples. I know most of the
times you don't have code examples yet as you're thinking of how to
solve your problems. But if you post one of the possible solutions the
experienced guys here will very likely direct you in the proper
direction. But without code it's hard to understand what you're after.

Cheers,

Jan


Thank you all for you answers to clarified I built a collection of
dictionnaries which represent database query on a bug tracking system:

backlog_tables , csv_backlog_table = _backlog_database(table=table,
periods=intervals_list)

backlog_tables is a dictionnary of bug info dictionnaries. The keys of
backlog_tables is a time intervall (YEAR-MONTH) as shown below:

backlog_tables= {'2011-01-01': [{'Assigned Date': datetime.date(2010,
10, 25),
'Category': 'Customer_claim',
'Date': datetime.date(2010, 10, 22),
'Duplicate Date': None,
'Fixed Reference': None,
'Headline': 'Impovement for all test',
'Identifier': '23269',
'Last Modified': datetime.date(2010, 10, 25),
'Priority': 'Low',
'Project': 'MY_PROJECT',
'Reference': 'MY_PROJECT@1.7beta2@20101006.0',
'Resolved Date': None,
'Severity': 'improvement',
'State': 'A',
'Submitter': 'Somebody'},
.....
}

_backlog_database() compute the tuple backlog_tables , csv_backlog_table:
In fact csv_backlog_table is the same as backlog_tables but instead of
having
the query dictionnaries it holds only the number of query which I use to
create
a CSV file and a graph over time range.

_backlog_database() is as follow:

def _backlog_database(table=None, periods=None):
"""Internal function. Re-arrange database table
according to a time period. Only monthly management
is computed in this version.

@param table the database of the list of defects. Each defect is a
dictionnary with fixed keys.
@param periods the intervals list of months and the first element is the
starting date and the
the last element is the ending date in string format.
@return (periods_tables, csv_table), a tuple of periodic dictionnary
table and
the same keys dictionnary with defect numbers associated values.
"""
if periods is None:
raise ValueError('Time interval could not be empty!')

periods_tables = {}
csv_table = {}

interval_table = []

for interval in periods:
split_date = interval.split('-')
for row in table:
if not len(split_date) == 3:
limit_date = _first_next_month_day(year=int(split_date[0]),
month=int(split_date[1]), day=1)
if row['Date'] < limit_date:
if not row['Resolved Date']:
if row['State'] == 'K':
if row['Last Modified'] >= limit_date:
interval_table.append(row)
elif row['State'] == 'D':
if row['Duplicate Date'] >= limit_date:
interval_table.append(row)
# New, Assigned, Opened, Postponed, Forwarded, cases.
else:
interval_table.append(row)
else:
if row['Resolved Date'] >= limit_date:
interval_table.append(row)

periods_tables[interval] = interval_table
csv_table[interval] = str(len(interval_table))

interval_table = []

return periods_tables, csv_table


This is not the whole function I reduce it on normal case but it shows
what I am doing.
In fact I choose to have both dictionnaries to debug my function and
analyse what's going
on. When everything will be fine I will need only the csv table (with
number per period) to create the graphs.
That's why I was asking for length computing. Honnestly, the actual
queries number is 500 (bug id) but It could be more
in other project. I was ambitious when I sais 1000 to 100000
dictionnaries elements but for the whole
list of products we have internally It could be 50000.

I see some similarity with my coding style (doing things "by the way"), which might not be so good ;-).

With this background information I would keep the responsibilities seperated. Your _backlog_database() function is supposed to do one thing: Return a dictionary which holds the interval and a list of result dicts. You could call this dict interval_to_result_tables (to indicate that the values are lists). That's all your function should do.

Then you want to print a report. This piece of functionality needs to know how long the lists for each dictionary entry are. Then this print_report function should be responsible to get the information it needs by creating it itself or calling another function, which has the purpose to create the information. Latter would be a bit too much, as the length would be simply be:

number_of_tables = len(interval_to_result_tables[interval])

I hope I understood your goals correctly and could help a bit,

Jan





Regards
Karim



Karim
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor

_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor

_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor

Reply via email to