Dear Pythonists,

I'm using Python 2.7 on Windows 7.
Problem description: I am currently working on a reinforcement learning paradigm in which I would like to update Qa values with alfaG (if decision_input == 1 and feedback_input == 1) or with alfaL (if decision_input == 1 and feedback_input == 0).

(1) I have two input lists, each with two values:

    decision_input = [1,1]    # each element can be 1, 2, 3, 4, 5 or 6
    feedback_input = [1,0]    # each element is either 1 or 0

(2) The update equations are the following.

For gain:

    Qa = Qa+(alfaG*(feedback_input-Qa))

so I would like to use alfaG only if the i-th element of feedback_input is 1.

For loss:

    Qa = Qa+(alfaL*(feedback_input-Qa))

so alfaL should be used only if the i-th element of feedback_input is 0.

Qa is initialized to zero.

(3) alfaG and alfaL should be incremented independently after updating the Qa value:

    alfaG = 0.01    # initial value
    alfaL = 0.01    # initial value

(4) The problematic code :(

    decision_input = [1,1]
    feedback_input = [1,0]
    a = []
    alfaG = 0.01
    alfaL = 0.01
    value = 0.04

    for i in range(len(decision_input)):
        if decision_input[i] == 1 and feedback_input[i] == 1:
            while alfaG < value:
                Qa = 0
                for feedb in feedback_input:
                    Qa = Qa+(alfaG*(feedb-Qa))
                    a.append(Qa)
                if decision_input[i] == 1 and feedback_input[i] == 0:
                    while alfaL < value:
                        for feedb in feedback_input:
                            Qa = Qa+(alfaL*(feedb-Qa))
                            a.append(Qa)
                alfaL += 0.01
                alfaG += 0.01
    print a

After running this, I get the following output:

    [0.01, 0.0099], [0.02, 0.0196], [0.03, 0.0291]

(5) I have no idea how to get the following output instead:

    [0.01, 0.0099], [0.01, 0.0098], [0.01, 0.0097]    --> alfaG = 0.01, alfaL = 0.01, 0.02, 0.03
    [0.02, 0.0198], [0.02, 0.0196], [0.02, 0.0194]    --> alfaG = 0.02, alfaL = 0.01, 0.02, 0.03
    [0.03, 0.0297], [0.03, 0.0294], [0.03, 0.0291]    --> alfaG = 0.03, alfaL = 0.01, 0.02, 0.03

Since both alfaG and alfaL take 3 values, I should get 3x3 lists.

Does anyone have an idea how to modify the code?

Best regards,
Zsolt
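
A minimal sketch of one way to produce the 3x3 result described in (5), under a few assumptions that are not in the original code. The idea is to loop over every (alfaG, alfaL) pair, restart Qa from zero for each pair, and choose the learning rate trial by trial from the feedback value. The function name q_trajectories and the explicit alphas list are hypothetical. The sketch also assumes that trials where the decision is not 1 leave Qa unchanged.

```python
def q_trajectories(decision_input, feedback_input, alphas_gain, alphas_loss):
    """For every (alfaG, alfaL) pair, return the Qa value after each trial.

    Hypothetical helper: result[g][l] is the list of Qa values obtained
    with alphas_gain[g] and alphas_loss[l].
    """
    results = []
    for alfaG in alphas_gain:
        row = []
        for alfaL in alphas_loss:
            Qa = 0.0  # Qa restarts from zero for every pair of learning rates
            trajectory = []
            for decision, feedback in zip(decision_input, feedback_input):
                if decision == 1:
                    # Gain trials (feedback == 1) use alfaG, loss trials use alfaL.
                    alpha = alfaG if feedback == 1 else alfaL
                    Qa = Qa + (alpha * (feedback - Qa))
                # Assumption: trials with another decision leave Qa unchanged.
                trajectory.append(Qa)
            row.append(trajectory)
        results.append(row)
    return results


decision_input = [1, 1]
feedback_input = [1, 0]
# Listing the rates explicitly avoids relying on repeated "+= 0.01",
# which can accumulate floating-point error.
alphas = [0.01, 0.02, 0.03]

grid = q_trajectories(decision_input, feedback_input, alphas, alphas)
for alfaG, row in zip(alphas, grid):
    # Rounding is for display only; the stored values keep full precision.
    rounded = [[round(q, 4) for q in trajectory] for trajectory in row]
    print("alfaG = %.2f: %s" % (alfaG, rounded))
```

Each row of grid corresponds to one alfaG value and each entry in a row to one alfaL value, which gives the 3x3 layout asked for. The print call takes a single formatted string, so the sketch should behave the same under Python 2.7 and Python 3.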
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor