In python 2, your use of range() without checking for a very large parameter n might cause either a MemoryError exception, or trigger a huge memory allocation just for the range list. Not a problem in python 3 of course.
On 8 June 2017 at 09:54, Stestagg <stest...@gmail.com> wrote: > I honestly can't see a way to improve this in python. My best solution is: > > def main(lines): > stack = [] > sa = stack.append > sp = stack.pop > si = stack.__getitem__ > for line in lines: > meth = line[:3] > if meth == b'pus': > sa(int(line[5:])) > elif meth == b'pop': > sp() > else: > parts = line[15:].split() > end = len(stack)-1 > amount = int(parts[1]) > for x in range(int(parts[0])): > index = end - x > stack[index] += amount > print(stack[-1] if stack else None) > > which comes out about 25% faster than your solution. > > One tool that's interesting to use here is: line_profiler: > https://github.com/rkern/line_profiler > > putting a @profile decorator on the above main() call, and running with > kernprof produces the following output: > > Line # Hits Time Per Hit % Time Line Contents > > ============================================================== > > 12 @profile > > 13 def main(lines): > > 14 1 4 4.0 0.0 stack = [] > > 15 2000001 949599 0.5 11.5 for line in lines: > > 16 2000000 1126944 0.6 13.7 meth = line[:3] > > 17 2000000 974635 0.5 11.8 if meth == b'pus': > > 18 1000000 1002733 1.0 12.2 > stack.append(int(line[5:])) > > 19 1000000 478756 0.5 5.8 elif meth == > b'pop': > > 20 999999 597114 0.6 7.2 stack.pop() > > 21 else: > > 22 1 6 6.0 0.0 parts = > line[15:].split() > > 23 1 2 2.0 0.0 end = > len(stack)-1 > > 24 1 1 1.0 0.0 amount = > int(parts[1]) > > 25 500001 241227 0.5 2.9 for x in > range(int(parts[0])): > > 26 500000 273477 0.5 3.3 index = end > - x > > 27 500000 309033 0.6 3.7 > stack[index] += amount > > 28 2000000 2295803 1.1 27.8 print(stack[-1]) > > > which shows that there's no obvious bottleneck (line by line) here (for my > sample data). > > Note the print() overhead dominates the runtime, and that's with me piping > the output to /dev/null directly. > > I had a go at using arrays, deques, and numpy arrays in various ways without > luck, but we're getting fairly close to the native python statement > execution overhead here (hence folding it all into one function). > > My only thoughts would be to see if there were some magic that could be done > by offloading the work onto a non-python library somehow. > > Another thing that might help some situations (hence my previous questions) > would be to implement the add_to_first_n as a lazy operator (i.e. have a > stack of the add_to_first_n values and dynamically add to the results of > pop() but that would proabably be much slow in the average case. > > Steve > > On Wed, Jun 7, 2017 at 7:34 PM Jonathan Hartley <tart...@tartley.com> wrote: >> >> Hey. >> >> Thanks for engaging, but I can't help with the most important of those >> questions - the large data sets on which my solution failed due to timeout >> are hidden from candidates. Not unreasonable to assume that they do exercise >> deep stacks, and large args to add_to_first_n, etc. >> >> Yes, the input looks exactly like your example. All args are integers. The >> question asked for output corresponding to the top of the stack after every >> operation. I omitted this print from inside the 'for' loop in 'main', >> thinking it irrelevant. >> >> I converted to integers inside 'dispatch'. 'args' must have actually been >> created with: >> >> args = [int(i) for i in tokens[1:]] >> >> Where len(tokens) is never going to be bigger than 3. >> >> Return values (from 'pop') were unused. >> >> >> On 6/7/2017 13:25, Stestagg wrote: >> >> Do you have any more context? >> For example, is the add_to_first_n likely to be called with very large >> numbers, or very often? Does the stack get very deep, or stay shallow? >> >> I'm assuming that lines look like this: >> >> push 1 >> push 2 >> add_to_first_n 2 10 >> pop >> pop >> >> with all arguments as integers, and the final value being returned from >> main()? >> How did you convert from string inputs to numeric values? >> How did you manage return values? >> >> :D >> >> On Wed, Jun 7, 2017 at 6:51 PM Jonathan Hartley <tart...@tartley.com> >> wrote: >>> >>> I recently submitted a solution to a coding challenge, in an employment >>> context. One of the questions was to model a simple stack. I wrote a >>> solution which appended and popped from the end of a list. This worked, but >>> failed with timeouts on their last few automated tests with large (hidden) >>> data sets. >>> >>> From memory, I think I had something pretty standard: >>> >>> class Stack: >>> >>> def __init__(self): >>> self.storage = [] >>> >>> def push(arg): >>> self.storage.append(arg) >>> >>> def pop(): >>> return self.storage.pop() if self.storage else None >>> >>> def add_to_first_n(n, amount): >>> for n in range(n): >>> self.storage[n] += amount >>> >>> def dispatch(self, line) >>> tokens = line.split() >>> method = getattr(self, tokens[0]) >>> args = tokens[1:] >>> method(*args) >>> >>> def main(lines): >>> stack = Stack() >>> for line in lines: >>> stack.dispatch(line) >>> >>> >>> (will that formatting survive? Apologies if not) >>> >>> Subsequent experiments have confirmed that appending to and popping from >>> the end of lists are O(1), amortized. >>> >>> So why is my solution too slow? >>> >>> This question was against the clock, 4th question of 4 in an hour. So I >>> wasn't expecting to produce Cython or C optimised code in that timeframe >>> (Besides, my submitted .py file runs on their servers, so the environment is >>> limited.) >>> >>> So what am I missing, from a performance perspective? Are there other >>> data structures in stdlib which are also O(1) but with a better constant? >>> >>> Ah. In writing this out, I have begun to suspect that my slicing of >>> 'tokens' to produce 'args' in the dispatch is needlessly wasting time. Not >>> much, but some. >>> >>> Thoughts welcome, >>> >>> Jonathan >>> >>> -- >>> Jonathan Hartley tart...@tartley.com http://tartley.com >>> Made out of meat. +1 507-513-1101 twitter/skype: tartley >>> >>> _______________________________________________ >>> python-uk mailing list >>> python-uk@python.org >>> https://mail.python.org/mailman/listinfo/python-uk >> >> >> >> _______________________________________________ >> python-uk mailing list >> python-uk@python.org >> https://mail.python.org/mailman/listinfo/python-uk >> >> >> -- >> Jonathan Hartley tart...@tartley.com http://tartley.com >> Made out of meat. +1 507-513-1101 twitter/skype: tartley >> >> _______________________________________________ >> python-uk mailing list >> python-uk@python.org >> https://mail.python.org/mailman/listinfo/python-uk > > > _______________________________________________ > python-uk mailing list > python-uk@python.org > https://mail.python.org/mailman/listinfo/python-uk > _______________________________________________ python-uk mailing list python-uk@python.org https://mail.python.org/mailman/listinfo/python-uk