Mats, Thank you!
So I added QUOTE_NONNUMERIC to my csv.reader() call and it almost worked.

Now, how wonderful that the scope's csv file simply wrote an s for seconds and didn't include quotes. Now Python tells me it can't create a float of s. Of course I can't edit a 4G file in any editor that I have installed, so I have to work with the fact that there is a bit of text in there that isn't quoted.

Which leads me to another question related to working with these csv files.

Is there a way for me to tell the reader to skip the first 'n' rows? Or, for that matter, skip rows in the middle of the file?

At this point, I think it may be less painful for me to just skip those few lines that have text. I don't believe there will be any loss of accuracy.

But, since row is not really an index, how does one conditionally skip a given set of row entries?

I started following the link to iterables but quickly got lost in the terminology.

Best,

On Mon, Jul 15, 2019 at 3:03 PM Mats Wichmann <m...@wichmann.us> wrote:

> On 7/15/19 12:35 PM, Chip Wachob wrote:
> > Oscar and Mats,
> >
> > Thank you for your comments and taking time to look at the snips.
> >
> > Yes, I think I had commented that the avg+trigger was = triggervolts in
> > my original post.
> >
> > I did find that there was an intermediary process which I had forgotten
> > to comment out that was adversely affecting the data in one instance and
> > not the other. So it WAS a case of becoming code blind. But I didn't
> > give y'all all of the code so you would not have known that. My apologies.
> >
> > Mats, I'd like to get a better handle on your suggestions about
> > improving the code. Turns out, I've got another couple of 4GByte files
> > to sift through, and they are less 'friendly' when it comes to
> > determining the start and stop points. So, I have to basically redo
> > about half of my code and I'd like to improve on my Python coding skills.
> >
> > Unfortunately, I have gaps in my coding time, and I end up forgetting
> > the details of a particular language, especially a new language to me,
> > Python.
> >
> > I'll admit that my 'C' background keeps me thinking of these data sets
> > as arrays.. in fact they are lists, eg:
> >
> > [
> >   [t0, v0],
> >   [t1, v1],
> >   [t2, v2],
> >   .
> >   .
> >   .
> >   [tn, vn]
> > ]
> >
> > Time and volts are floats and need to be converted from the csv file
> > entries.
> >
> > I'm not sure that I follow the "unpack" assignment in your example of:
> >
> > for row in TrigWind:
> >     time, voltage = row  # unpack
> >
> > I think I 'see' what is happening, but when I read up on unpacking, I
> > see it referring to using the * and ** when passing arguments to a
> > function...
>
> That's a different aspect of unpacking. This one is sequence unpacking,
> sometimes called tuple (or sequence) assignment. In the official
> Python docs it is described in the latter part of this section:
>
> https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences
>
> > I tried it anyhow, with this being an example of my source data:
> >
> > "Record Length",2000002,"Points",-0.005640001706,1.6363
> > "Sample Interval",5e-09,s,-0.005639996706,1.65291
> > "Trigger Point",1128000,"Samples",-0.005639991706,1.65291
> > "Trigger Time",0.341197,s,-0.005639986706,1.60309
> > ,,,-0.005639981706,1.60309
> > "Horizontal Offset",-0.00564,s,-0.005639976706,1.6363
> > ,,,-0.005639971706,1.65291
> > ,,,-0.005639966706,1.65291
> > ,,,-0.005639961706,1.6363
> > .
> > .
> > .
> >
> > Note that I want the items in the third and fourth column of the csv
> > file for my time and voltage.
> >
> > When I tried to use the unpack, they all came over as strings. I can't
> > seem to convert them selectively..
>
> That's what the csv module does, unless you tell it not to.
> Maybe this will help:
>
> https://docs.python.org/3/library/csv.html#csv.reader
>
> There's an option to convert unquoted values to floats, and leave quoted
> values alone as strings, which would seem to match your data above quite
> well.
>
> > Desc1, Val1, Desc2, TimeVal, VoltVal = row
> >
> > TimeVal and VoltVal return type of str, which makes sense.
> >
> > Must I go through yet another iteration of scanning TimeVal and VoltVal
> > and converting them using float() by saving them to another array?
> >
> > Thanks for your patience.
> >
> > Chip
> >
> > On Sat, Jul 13, 2019 at 9:36 AM Mats Wichmann <m...@wichmann.us> wrote:
> >
> >     On 7/11/19 8:15 AM, Chip Wachob wrote:
> >
> >     kinda restating what Oscar said, he came to the same conclusions, I'm
> >     just being a lot more wordy:
> >
> >     > So, here's where it gets interesting. And, I'm presuming that someone out
> >     > there knows exactly what is going on and can help me get past this hurdle.
> >
> >     Well, each snippet has some "magic" variables (from our point of view,
> >     since we don't see where they are set up):
> >
> >     1: if(voltage > (avg + triglevel)
> >
> >     2: if((voltage > triggervolts)
> >
> >     since the value you're comparing voltage to gates when you decide
> >     there's a transition, and thus what gets added to the transition list
> >     you're building, and the list size comes out different, and you claim
> >     the data are the same, then guess where a process of elimination
> >     suggests the difference is coming from?
> >
> >     ===
> >
> >     Stylistic comment, I know this wasn't your question.
> >
> >     > for row in range (len(TrigWind)):
> >
> >     Don't do this. It's not a coding error giving you wrong results, but
> >     it's not efficient and makes for harder to read code. You already have
> >     an iterable in TrigWind.
> >     You then find the size of the iterable and use
> >     that size to generate a range object, which you then iterate over,
> >     producing index values which you use to index into the original
> >     iterable. Why not skip all that? Just do
> >
> >     for row in TrigWind:
> >
> >     now row is actually a row, as the variable name suggests, rather than an
> >     index you use to go retrieve the row.
> >
> >     Further, the "row" entries in TrigWind are lists (or tuples, or some
> >     other indexable iterable, we can't tell), which means you end up
> >     indexing into two things - into the "array" to get the row, then into
> >     the row to get the individual values. It's nicer if you unpack the rows
> >     into variables so they can have meaningful names - indeed you already do
> >     that with one of them. Lets you avoid code snips like "x[7][1]"
> >
> >     Conceptually then, you can take this:
> >
> >     for row in range(len(TrigWind)):
> >         voltage = float(TrigWind[row][1])
> >         ...
> >         edgearray.append([float(TrigWind[row][0]),
> >                           float(TrigWind[row][1])])
> >         ...
> >
> >     and change to this:
> >
> >     for row in TrigWind:
> >         time, voltage = row  # unpack
> >         ...
> >         edgearray.append([float(time), float(voltage)])
> >
> >     or even more compactly you can unpack directly at the top:
> >
> >     for time, voltage in TrigWind:
> >         ...
> >         edgearray.append([float(time), float(voltage)])
> >         ...
> >
> >     Now I left an issue to resolve with conversion - voltage is not
> >     converted before its use in the not-shown comparisons. Does it need to
> >     be? Every usage of the values from the individual rows here uses them
> >     immediately after converting them to float. It's usually better not to
> >     convert all over the place, and since the creation of TrigWind is under
> >     your own control, you should do that at the point the data enters the
> >     program - that is as TrigWind is created; then you just consume data
> >     from it in its intended form.
> >     But if not, just convert voltage before
> >     using, as your original code does. You don't then need to convert
> >     voltage a second time in the list append statements.
> >
> >     for time, voltage in TrigWind:
> >         voltage = float(voltage)
> >         ...
> >         edgearray.append([float(time), voltage])
> >         ...
> >
> >     _______________________________________________
> >     Tutor maillist - Tutor@python.org
> >     To unsubscribe or change subscription options:
> >     https://mail.python.org/mailman/listinfo/tutor
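
On the question at the top of this message about skipping the first 'n' rows and conditionally skipping rows in the middle of the file, here is a minimal sketch of one possible approach, not a definitive answer. It assumes time and voltage are the last two of five columns, as in the sample data quoted above. The names read_samples, skip_first and SAMPLE are made up for illustration, and the last row of SAMPLE is a hypothetical bad row added only to show the skip. itertools.islice discards the leading rows, and a try/except around float() skips any row whose fields will not convert.

```python
import csv
import io
from itertools import islice

# Rows copied from the scope data quoted in this thread, plus one
# hypothetical malformed row at the end to demonstrate skipping.
SAMPLE = '''"Record Length",2000002,"Points",-0.005640001706,1.6363
"Sample Interval",5e-09,s,-0.005639996706,1.65291
"Trigger Point",1128000,"Samples",-0.005639991706,1.65291
,,,-0.005639981706,1.60309
,,,not-a-number,1.0
'''


def read_samples(fileobj, skip_first=0):
    """Yield (time, voltage) as floats, one row at a time.

    Assumes time is column index 3 and voltage is column index 4.
    """
    reader = csv.reader(fileobj)
    # islice consumes and throws away the first skip_first rows.
    for row in islice(reader, skip_first, None):
        if len(row) < 5:
            continue  # too few columns: skip this row
        try:
            time, voltage = float(row[3]), float(row[4])
        except ValueError:
            continue  # text where a number was expected: skip this row
        yield time, voltage


samples = list(read_samples(io.StringIO(SAMPLE), skip_first=1))
for time, voltage in samples:
    print(time, voltage)
```

Because read_samples is a generator, it hands back one row at a time instead of building the whole list in memory, which may matter for a 4G file. With a real file the call would look something like `with open(path, newline="") as f:` followed by `for time, voltage in read_samples(f, skip_first=n):`, where path and n are whatever fits your data.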