Re: [Tutor] text processing lines variable content

2019-02-07 Thread ingo janssen



On 07/02/2019 18:06, Peter Otten wrote:

Sorry, I don't understand the question.


After a quick look, that is not unlike what you propose, but I have to 
investigate further.


lengths of chunks are known or can be found (sketchy):

order= [%i,%q,%r,%w,%p,%P,%o,%m,%g,%E,%s,%e,%F,%a,%A,%f,%t,%l,%n,%v,%c,%C]
length=[ 1, 3, 1, 1,%w,%w,%w, 1, 1, 1, 1,%s, 1,%s,%a,%s,%s,%s,%s, 1, 3, 3]

from there calculate the slices per line
slices={"%i":(0,1), "%q":(1,4), "%r":(4,5), ...}  # etc.

modify all functions to accept and deal with the slice tuple, then the 
action loop gets very simple:


for points, line in enumerate(open("vorodat.txt.vol",'r'), 1):
  line = line.strip()
  line = line.split(" ")
  slices = calculate_slices(line)
  function[action](content[action],slices[action])
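
A minimal sketch of what such a calculate_slices could look like. The 
LENGTHS table below is a hypothetical subset of the codes above, and 
passing order as a second argument is my assumption, not something the 
gist does. It only works when a count field such as %w comes before the 
fields that depend on it:

```python
# Hypothetical length table (subset of the codes above). An int is a fixed
# token count; a string names the earlier code whose value gives the count.
LENGTHS = {"%i": 1, "%q": 3, "%w": 1, "%p": "%w", "%s": 1, "%e": "%s"}


def calculate_slices(tokens, order, lengths=LENGTHS):
    """Return {code: (start, stop)} for one already split line."""
    # Codes whose value is used as the length of some other field.
    referenced = {spec for spec in lengths.values() if isinstance(spec, str)}
    counts = {}   # count values seen so far on this line
    slices = {}
    start = 0
    for code in order:
        spec = lengths[code]
        # A string spec refers to a count read earlier on the same line;
        # this raises KeyError if that count has not been seen yet.
        n = counts[spec] if isinstance(spec, str) else spec
        slices[code] = (start, start + n)
        if code in referenced:
            counts[code] = int(tokens[start])
        start += n
    return slices


tokens = "7 1.0 2.0 3.0 2 (a) (b) 3 x y z".split()
slices = calculate_slices(tokens, ["%i", "%q", "%w", "%p", "%s", "%e"])
```

Each function can then take tokens[start:stop] instead of chopping the 
front off the list.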

thanks for your time and insight, I'll try a few different ways

ingo
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] text processing lines variable content

2019-02-07 Thread Peter Otten
ingo janssen wrote:

> 
> 
> On 07/02/2019 11:08, Peter Otten wrote:
>> replace the sequence of tests with dictionary lookups
> 
> updated the gist a few times, now I could pre calculate the slices to be
> taken per line, but will there be much gain compared to the chopping from
> the left side of the list?

Sorry, I don't understand the question.


Looking at your code

> if action == "%i":
>     lbl = function[action](content[action])

you really do not need the function[action] lookup here because you know 
that the result will always be f_number. Likewise you could bind 
content["%i"] to a name, labels, say, and then write

if action == "%i":
    lbl = f_number(labels)

which I find much more readable. 

A lookup table only makes sense if it provides all necessary information. I 
tried to apply the idea to one of your gist versions:

import re
from functools import partial

def set_lbl(items):
    global lbl
    lbl = f_number(items)

def set_w(items):
    global v
    v = f_number(items)

def set_f(items):
    global f
    f = f_number(items)

def set_mx(items):
    global mx
    mx = mx_value_array(items, f)

function = {
    "%i" : set_lbl,
    "%w" : set_w,
    "%s" : set_f,
    "%a" : set_mx,
    "%q" : f_vector,
    "%r" : f_value,
    "%p" : lambda items: f_vector_array(items, v),
    "%P" : lambda items: f_vector_array(items, v),
    "%o" : lambda items: f_value_array(items, v),
    "%m" : f_value,
    "%g" : f_number,
    "%E" : f_value,
    "%e" : lambda items: f_value_array(items, f),
    "%F" : f_value,
    "%A" : lambda items: f_value_array(items, mx + 1),
    "%f" : lambda items: f_value_array(items, f),
    "%t" : lambda items: f_nested_value_array(items, f),
    "%l" : lambda items: f_vector_array(items, f),
    "%n" : lambda items: f_value_array(items, f),
    "%v" : f_value,
    "%c" : f_vector,
    "%C" : f_vector
}

order = "%i %q %r %w %p %P %o %m %g %E %s %e %F %a %A %f %t %l %n %v %c %C"
order = re.findall("%[a-z]",order,re.M|re.I)
content = {}

actions = []

for i in order:
    # one output list per format code, bound to its handler up front
    items = content[i] = []
    actions.append(partial(function[i], items))

for points, line in enumerate(open("vorodat.txt.vol",'r'), 1):
    line = line.strip()
    line = line.split(" ")
    for action in actions:
        action()

However, while the loop is rather clean now the rest of the code is 
sprinkled with implicit arguments and thus much worse than what you have.
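
One possible way around the implicit arguments, as a rough sketch only: 
keep the per-line state (tokens, position, counts read so far) in one 
object and hand it to every handler explicitly. LineState, read_count, 
read_values and read_vector are hypothetical names, not functions from 
the gist, and only three format codes are covered:

```python
from dataclasses import dataclass, field


@dataclass
class LineState:
    """Everything a handler needs to know about the current line."""
    tokens: list
    pos: int = 0
    counts: dict = field(default_factory=dict)

    def take(self, n):
        """Return the next n tokens and advance the position."""
        chunk = self.tokens[self.pos:self.pos + n]
        self.pos += n
        return chunk


def read_count(name):
    """Build a handler that stores one token and remembers it as a count."""
    def handler(state, out):
        (value,) = state.take(1)
        state.counts[name] = int(value)
        out.append(value)
    return handler


def read_values(count_name):
    """Build a handler that reads as many tokens as an earlier count says."""
    def handler(state, out):
        out.append(state.take(state.counts[count_name]))
    return handler


def read_vector(state, out):
    out.append(tuple(state.take(3)))


# Hypothetical subset of the lookup table.
HANDLERS = {"%q": read_vector, "%w": read_count("%w"), "%p": read_values("%w")}


def parse(lines, order):
    content = {code: [] for code in order}   # start with empty lists
    for line in lines:
        state = LineState(line.split())
        for code in order:
            HANDLERS[code](state, content[code])
    return content
```

The loop stays as short as before, but nothing depends on module-level 
names like v or f being set at the right moment.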



Re: [Tutor] text processing lines variable content

2019-02-07 Thread ingo janssen




On 07/02/2019 11:08, Peter Otten wrote:

replace the sequence of tests with dictionary lookups


updated the gist a few times, now I could pre calculate the slices to be 
taken per line, but will there be much gain compared to the chopping from 
the left side of the list?


ingo


Re: [Tutor] text processing lines variable content

2019-02-07 Thread ingo janssen




On 07/02/2019 11:08, Peter Otten wrote:

Personally I would avoid the NameError and start with empty lists. If you
manage to wrap all branches into functions with the same signature you can
replace the sequence of tests with dictionary lookups.


Just before I saw your post I put my current code up here:

https://gist.github.com/ingoogni/e99c561f23777e59a5aa6b4ef5fe37c8

I will study yours,

ingo


Re: [Tutor] text processing lines variable content

2019-02-07 Thread ingo janssen




On 07/02/2019 10:40, Alan Gauld via Tutor wrote:

Just saves a little typing is all.


Sensei,

be lazy, I will study


current state of code is at
https://gist.github.com/ingoogni/e99c561f23777e59a5aa6b4ef5fe37c8

ingo


Re: [Tutor] text processing lines variable content

2019-02-07 Thread Peter Otten
ingo janssen wrote:

> 
> On 07/02/2019 09:29, Peter Otten wrote:
>> Where will you get the order from?
> 
> Peter,
> 
> the order comes from the command line. 

Then my one-function-per-format approach won't work.

> I intend to call the python
> program with the same command line options as the Voro++ program. Make
> the python program call the Voro++ and process its output.
> 
> one command line option contains a string for formatting the output.
> That is what I use for order.
> 
> #all output formatting options
> order = "%i %q %r %w %p %P %o %m %g %E %s %e %F %a %A %f %t %l %n %v %c %C"
> order = re.findall("%[a-z]",order,re.M|re.I)
> for i, line in enumerate(open("vorodat.vol",'r')):
>   points = i
>   line = line.strip()
>   line = line.split(" ")
>   for action in order:
>     if action == "%i":
>       try:
>         lbl = f_label(label)
>       except NameError as e:
>         lbl = f_number(label)
>         label=[lbl]

Personally I would avoid the NameError and start with empty lists. If you 
manage to wrap all branches into functions with the same signature you can 
replace the sequence of tests with dictionary lookups. Here's a sketch:


# the f_...() functions take a parsed line and return a value and the
# as yet unused rest of the parsed line

labels = []
points = []
...
def add_label(parts):
    label, rest = f_label(parts)
    labels.append(label)
    return rest

def add_point(parts):
    point, rest = f_vector(parts)
    points.append(point)
    return rest

def set_width(parts):
    global width
    width, rest = f_width(parts)
    return rest

lookup_actions = {
    "%i": add_label,
    "%q": add_point,
    "%w": set_width,
    ...
}

actions = [lookup_actions[action] for action in order]

with open("vorodat.vol") as instream:
    # count lines under a separate name so the points list is not overwritten
    for num_points, line in enumerate(instream, 1):  # as per Mark's advice
        width = None  # dummy value to provoke error when width
                      # is not explicitly set
        parts = line.split()
        for action in actions:
            parts = action(parts)


>     elif action == "%q":
>       try:
>         f_vector(point)
>       except NameError as e:
>         point = [f_vector(point)]
>     elif action == "%r":
>       try:
>         f_value(radius)
>       except NameError as e:
>         radius=[f_value(radius)]
> etc.
> 
> order is important as %w tells me how long %p, %P and %o will be. This
> varies per line.
> 
> I'll look into what you wrote,



Re: [Tutor] text processing lines variable content

2019-02-07 Thread Alan Gauld via Tutor
On 07/02/2019 08:58, ingo janssen wrote:

> try:
>   lbl = f_label(label)
> except NameError as e:
>   lbl = f_number(label)
>   label=[lbl]

Just a minor point but since you aren't doing
anything with the error you don't need the
'as e' bit at the end of each line...

Just saves a little typing is all.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos




Re: [Tutor] text processing lines variable content

2019-02-07 Thread ingo janssen



On 07/02/2019 09:58, ingo janssen wrote:


On 07/02/2019 09:29, Peter Otten wrote:

Where will you get the order from?




Ahrg, that should have been:


#all output formatting options
order = "%i %q %r %w %p %P %o %m %g %E %s %e %F %a %A %f %t %l %n %v %c %C"
order = re.findall("%[a-z]",order,re.M|re.I)
for i, line in enumerate(open("vorodat.vol",'r')):
  points = i
  line = line.strip()
  line = line.split(" ")
  for action in order:
    if action == "%i":
      try:
        lbl = f_label(label)
      except NameError as e:
        label=[]
        lbl = f_number(label)
    elif action == "%q":
      try:
        f_vector(point)
      except NameError as e:
        point = []
        f_vector(point)
    elif action == "%r":
      try:
        f_value(radius)
      except NameError as e:
        radius = []
        f_value(radius)



Re: [Tutor] text processing lines variable content

2019-02-07 Thread ingo janssen



On 07/02/2019 09:29, Peter Otten wrote:

Where will you get the order from?


Peter,

the order comes from the command line. I intend to call the python 
program with the same command line options as the Voro++ program. Make 
the python program call the Voro++ and process its output.


one command line option contains a string for formatting the output. 
That is what I use for order.


#all output formatting options
order = "%i %q %r %w %p %P %o %m %g %E %s %e %F %a %A %f %t %l %n %v %c %C"
order = re.findall("%[a-z]",order,re.M|re.I)
for i, line in enumerate(open("vorodat.vol",'r')):
  points = i
  line = line.strip()
  line = line.split(" ")
  for action in order:
    if action == "%i":
      try:
        lbl = f_label(label)
      except NameError as e:
        lbl = f_number(label)
        label=[lbl]
    elif action == "%q":
      try:
        f_vector(point)
      except NameError as e:
        point = [f_vector(point)]
    elif action == "%r":
      try:
        f_value(radius)
      except NameError as e:
        radius=[f_value(radius)]
etc.

order is important as %w tells me how long %p, %P and %o will be. This 
varies per line.


I'll look into what you wrote,

thanks,

ingo


Re: [Tutor] text processing lines variable content

2019-02-07 Thread Peter Otten
ingo janssen wrote:

> depending on how the input file is created data packet a can be in an
> other position for every line.
> figured out how to do it though
> 
> order=[a,b,e,d...]
> for i in lines:
>   i=i.split(" ")
>   for j in order:
>     if j == a:
>       use function for processing data chunk a
>     elif j == b:
>       use proper function for processing data type b
>     ...

Where will you get the order from? If you plan to specify it manually, e. g.

import sys

lookup_steps = {
    "foo": [a, b, c, ...],
    "bar": [a, a, f, ...],
}
fileformat = sys.argv[1]
steps = lookup_steps[fileformat]
...
for line in lines:
    for step in steps:
        if step == a:
            ...
        elif step == b:
            ...

then I recommend storing one function per file format instead:

def process_foo(line):
    ...  # process one line in foo format

def process_bar(line):
    ...

lineprocessors = {
    "foo": process_foo,
    "bar": process_bar,
}
fileformat = sys.argv[1]
process = lineprocessors[fileformat]
...
for line in lines:
    process(line)

That way you deal with Python functions instead of a self-invented 
minilanguage.



Re: [Tutor] text processing lines variable content

2019-02-06 Thread ingo janssen




On 06/02/2019 21:45, Mark Lawrence wrote:

So what, you still don't need to chop the front from the list, just 
process the data.


just slice




I'd like to adapt the order in that the functions are applied, but how?


I suspect that you're trying to over complicate things, what's wrong 
with a simple if/elif chain, a switch based on a dict or similar?




You mean create a list with the order=[a,b,e,d...]

Again I've no idea what you're saying here.


Depending on how the input file is created, data packet a can be in a 
different position on every line.

figured out how to do it though

order=[a,b,e,d...]
for i in lines:
  i=i.split(" ")
  for j in order:
    if j == a:
      use function for processing data chunk a
    elif j == b:
      use proper function for processing data type b
    ...

I don't know beforehand 
how many lines I have.


Now you tell us :-(


sorry


then loop order=[a,b,e,d...] for each line



What has a loop order got to do with using a dict?


order of data chunks varies per file

Why bother, just have a list of lists and index on the position, or are 
we talking at cross purposes?


Sorry for the amount of text below, I hope it clarifies
one line of space delimited input data:

0 1094.82 0.1 582.419 0.5 14 (0.200231,1.13714,-8.35338) 
(-10.2097,1.13714,-4.05001) (-10.2097,-14.3466,-4.05001) 
(-2.4419,-39.895,9.65513) (-0.382375,-100.1,7.27361) 
(0.200231,-100.1,-8.35338) (-2.43137,1.58294,9.64296) 
(-10.1818,1.514,-4.00085) (-2.4419,1.51399,9.65513) 
(3.73705,-100.1,2.51013) (0.220825,1.58294,-8.29013) 
(-6.42082,-100.1,-5.61629) (-10.1626,1.58294,-3.9977) 
(3.73705,1.58294,2.51013) (1095.02,1.23714,574.066) 
(1084.61,1.23714,578.369) (1084.61,-14.2466,578.369) 
(1092.38,-39.795,592.074) (1094.44,-100,589.693) (1095.02,-100,574.066) 
(1092.39,1.68294,592.062) (1084.64,1.614,578.418) 
(1092.38,1.61399,592.074) (1098.56,-100,584.929) 
(1095.04,1.68294,574.129) (1088.4,-100,576.803) 
(1084.66,1.68294,578.421) (1098.56,1.68294,584.929) 3 3 3 3 3 3 3 3 3 3 
3 3 3 3 10092.8 21 550.726 9 23.4034 221.001 102.986 190.388 219.178 
39.1211 226.154 47.7032 31.5186 4765.01 5 5 5 4 6 4 5 4 4 0 0 0 0 4 4 1 
5.07336 964.581 451.085 1100.75 865.736 81.7357 1161.69 133.262 1.10745 
(1,0,10,12,7) (1,2,11,5,0) (1,7,8,3,2) (2,3,4,11) (3,8,6,13,9,4) 
(4,9,5,11) (5,9,13,10,0) (6,12,10,13) (6,8,7,12) 
(-0.377877,0.147157,-0.914086) (-0.382036,2.8913e-18,-0.924147) 
(-0.869981,0,0.493086) (-0.904528,-0.0477043,0.423738) 
(0.75639,-5.72053e-15,0.654121) (-0,-1,0) 
(0.950875,4.0561e-18,-0.309575) (-5.99268e-17,1,-1.44963e-16) 
(-0.849681,0.21476,0.481581) 9205 9105 3062 9946 5786 -3 1483 100 3262 
11680.5 -2.00777 -44.9048 -0.428504 1092.81 -44.8048 581.99


this one line as it is in the output file. For a file with 10 lines 
the outer arrays will be 10 items long:


#declare Labels = array[0]{0}
#declare Points = array[0]{<1094.82,0.1,582.419>}
#declare Radii = array[0]{0.5}
#declare NumVertices = array[0]{14}
#declare RelVertices = array[0]{
  //label: 0
  array[14]{

<0.200231,1.13714,-8.35338>,<-10.2097,1.13714,-4.05001>,<-10.2097,-14.3466,-4.05001>,<-2.4419,-39.895,9.65513>,<-0.382375,-100.1,7.27361>,<0.200231,-100.1,-8.35338>,<-2.43137,1.58294,9.64296>,<-10.1818,1.514,-4.00085>,<-2.4419,1.51399,9.65513>,<3.73705,-100.1,2.51013>,<0.220825,1.58294,-8.29013>,<-6.42082,-100.1,-5.61629>,<-10.1626,1.58294,-3.9977>,<3.73705,1.58294,2.51013>
  }
}
#declare GlobalVertices = array[0]{
  //label: 0
  array[14]{

<1095.02,1.23714,574.066>,<1084.61,1.23714,578.369>,<1084.61,-14.2466,578.369>,<1092.38,-39.795,592.074>,<1094.44,-100,589.693>,<1095.02,-100,574.066>,<1092.39,1.68294,592.062>,<1084.64,1.614,578.418>,<1092.38,1.61399,592.074>,<1098.56,-100,584.929>,<1095.04,1.68294,574.129>,<1088.4,-100,576.803>,<1084.66,1.68294,578.421>,<1098.56,1.68294,584.929>
  }
}
#declare MaxRadius = array[0]{10092.8}
#declare NumEdges = array[0]{21}
#declare EdgeDistance = array[0]{550.726}
#declare NumFaces = array[0]{9}
#declare FacePerimeter = array[0]{
  //label: 0

array[9]{23.4034,221.001,102.986,190.388,219.178,39.1211,226.154,47.7032,31.5186}
}
#declare SurfaceArea = array[0]{4765.01}
#declare FacesOrders = array[0]{
  //label: 0
  array[9]{5,5,5,4,6,4,5,4,4}
}
#declare FreqFaces = array[0]{
  //label: 0
  array[7]{0,0,0,0,4,4,1}
}
#declare FaceArea = array[0]{
  //label: 0

array[9]{5.07336,964.581,451.085,1100.75,865.736,81.7357,1161.69,133.262,1.10745}
}
#declare FaceVerticesIndex = array[0]{
  //label: 0
  array[9]{
array[5]{1,0,10,12,7},
array[5]{1,2,11,5,0},
array[5]{1,7,8,3,2},
array[4]{2,3,4,11},
array[6]{3,8,6,13,9,4},
array[4]{4,9,5,11},
array[5]{5,9,13,10,0},
array[4]{6,12,10,13},
array[4]{6,8,7,12},
  }
}
#declare FaceNormal = array[0]{
  //label: 0
  array[9]{


Re: [Tutor] text processing lines variable content

2019-02-06 Thread Mark Lawrence

On 06/02/2019 18:51, ingo janssen wrote:


On 06/02/2019 19:07, Mark Lawrence wrote:

That's going to be a lot of work slicing and dicing the input lists. 
Perhaps a chunked recipe like this 
https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.chunked 
would be better.


The length of the text chunks varies from a single character to a list 
of ~30 3D vectors.


So what, you still don't need to chop the front from the list, just 
process the data.





I'd like to adapt the order in that the functions are applied, but how?


I suspect that you're trying to over complicate things, what's wrong 
with a simple if/elif chain, a switch based on a dict or similar?




You mean create a list with the order=[a,b,e,d...]
if a in order:
   f_vector_array(a, 3)
elseif b in order:
   f_value(max_radius)

that would run the proper function, but not in the right order?


Again I've no idea what you're saying here.





for i, line in enumerate(open("vorodat.vol",'r')):
   points = i+1


enumerate takes a start argument so you shouldn't need the above line.


points is needed later on in the program and I don't know beforehand how 
many lines I have.


Now you tell us :-(



I thought about putting the functions in a dict and then create a 
list with the proper order, but can't get it to work.


Please show us your code and exactly why it didn't work.



def f_vector_array(outlist, length):
   rv = pop_left_slice(line, length)
   rv = [f'<{i[1:-1]}>' for i in rv]  #i format is: '(1.234,2.345,3.456)'
   rv = ",".join(rv)
   outlist.append(f"  //label: {lbl}\n  array[{length}]"+"{\n "+rv+"\n  }\n")


functions={
  'a':f_number(num_vertex),
  'b':f_vector_array(rel_vertex,v)
}
where rel_vertex is the list to move the processed data to and v is 
the amount of text to chop off the front of the line. v is not known when 
defining the dictionary. v comes from another function, 
v=f_number(num_vertex), that should also live in the dict.


You don't need to specify the parameters in the dict, just give the 
function name.
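
A small sketch of the difference. Writing f_number(num_vertex) inside the 
dict calls the function while the dict is being built; storing the name 
(optionally with functools.partial to bind what is already known) defers 
the call. The two functions below are simplified stand-ins with a 
hypothetical signature, not the originals from this thread:

```python
from functools import partial


def f_number(tokens, out):
    """Consume one token, store it, and return it as an int."""
    value = int(tokens.pop(0))
    out.append(value)
    return value


def f_vector_array(tokens, out, length):
    """Consume `length` tokens and store them as one list."""
    out.append([tokens.pop(0) for _ in range(length)])


num_vertex, rel_vertex = [], []

# Store the functions themselves. partial() binds the output list now;
# the length is supplied later, at call time, once it is known.
functions = {
    "a": partial(f_number, out=num_vertex),
    "b": partial(f_vector_array, out=rel_vertex),
}

tokens = "2 (1,2,3) (4,5,6)".split()
v = functions["a"](tokens)          # the call happens here, not in the dict
functions["b"](tokens, length=v)
```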


then loop order=[a,b,e,d...] for each line



What has a loop order got to do with using a dict?



I'm not absolutely sure what you're saying here, but would something 
like the SortedList from 
http://www.grantjenks.com/docs/sortedcontainers/ help?


Maybe this explains it better, assume the split input lines:
line1=[a,b,c,d,e,f,...]
line2=[a,b,c,d,e,f,...]
line3=[a,b,c,d,e,f,...]
...
line10=...

all data on position a should go to list a

a=[a1,a2,a3,...a_n]
b=[b1,b2,b3,...b_n]
c=[c1,c2,c3,...c_n]
etc.

this is what for example the function f_vector_array(a, 3) does.


Why bother, just have a list of lists and index on the position, or are 
we talking at cross purposes?




All these lists have to be written to a single file, each list contains 
10 items. Instead of keeping it all in memory I could write a1 to a 
temp file A instead of putting it in a list first and b1 to a temp file 
B etc. in the next loop a2 to file A, b2 to file B etc. When all lines 
are processed combine the files A,B,C ... to a single file. Or is there 
a more practical way? Speed is not important.


What is your definition of "combine the files A,B,C ... to a single file"?



ingo


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: [Tutor] text processing lines variable content

2019-02-06 Thread ingo janssen


On 06/02/2019 19:07, Mark Lawrence wrote:

That's going to be a lot of work slicing and dicing the input lists. 
Perhaps a chunked recipe like this 
https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.chunked 
would be better.


The length of the text chunks varies from a single character to a list 
of ~30 3D vectors.


I'd like to adapt the order in that 
the functions are applied, but how?


I suspect that you're trying to over complicate things, what's wrong 
with a simple if/elif chain, a switch based on a dict or similar?




You mean create a list with the order=[a,b,e,d...]
if a in order:
  f_vector_array(a, 3)
elseif b in order:
  f_value(max_radius)

that would run the proper function, but not in the right order?



for i, line in enumerate(open("vorodat.vol",'r')):
   points = i+1


enumerate takes a start argument so you shouldn't need the above line.


points is needed later on in the program and I don't know beforehand how 
many lines I have.


I thought about putting the functions in a dict and then create a list 
with the proper order, but can't get it to work.


Please show us your code and exactly why it didn't work.



def f_vector_array(outlist, length):
  rv = pop_left_slice(line, length)
  rv = [f'<{i[1:-1]}>' for i in rv]  #i format is: '(1.234,2.345,3.456)'
  rv = ",".join(rv)
  outlist.append(f"  //label: {lbl}\n  array[{length}]"+"{\n "+rv+"\n  }\n")


functions={
 'a':f_number(num_vertex),
 'b':f_vector_array(rel_vertex,v)
}
where rel_vertex is the list to move the processed data to and v is 
the amount of text to chop off the front of the line. v is not known when 
defining the dictionary. v comes from another function, 
v=f_number(num_vertex), that should also live in the dict.


then loop order=[a,b,e,d...] for each line



I'm not absolutely sure what you're saying here, but would something 
like the SortedList from 
http://www.grantjenks.com/docs/sortedcontainers/ help?


Maybe this explains it better, assume the split input lines:
line1=[a,b,c,d,e,f,...]
line2=[a,b,c,d,e,f,...]
line3=[a,b,c,d,e,f,...]
...
line10=...

all data on position a should go to list a

a=[a1,a2,a3,...a_n]
b=[b1,b2,b3,...b_n]
c=[c1,c2,c3,...c_n]
etc.

this is what for example the function f_vector_array(a, 3) does.

All these lists have to be written to a single file, each list contains 
10 items. Instead of keeping it all in memory I could write a1 to a 
temp file A instead of putting it in a list first and b1 to a temp file 
B etc. in the next loop a2 to file A, b2 to file B etc. When all lines 
are processed combine the files A,B,C ... to a single file. Or is there 
a more practical way? Speed is not important.


ingo


Re: [Tutor] text processing lines variable content

2019-02-06 Thread Mark Lawrence

On 06/02/2019 16:33, ingo janssen wrote:
For parsing the output of the Voro++ program and writing the data to a 
POV-Ray include file I created a bunch of functions.


def pop_left_slice(inputlist, length):
   outputlist = inputlist[0:length]
   del inputlist[:length]
   return outputlist


That's going to be a lot of work slicing and dicing the input lists. 
Perhaps a chunked recipe like this 
https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.chunked 
would be better.
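
Since the chunk lengths here vary, a fixed-size chunked() may not fit 
directly. As an alternative sketch using only the standard library (this 
swaps in itertools.islice, not more_itertools), an iterator over the 
split line remembers its own position, so nothing has to be deleted from 
the front of a list. The helper name take and the shortened sample line 
are assumptions for illustration:

```python
from itertools import islice


def take(tokens, n):
    """Return the next n items from the iterator `tokens` as a list."""
    return list(islice(tokens, n))


line = "0 1094.82 0.1 582.419 0.5 2 (1,2,3) (4,5,6)"
tokens = iter(line.split())   # one iterator per line

label = take(tokens, 1)[0]
x, y, z = take(tokens, 3)
radius = take(tokens, 1)[0]
count = int(take(tokens, 1)[0])
vertices = take(tokens, count)   # length comes from the count just read
```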




this is used by every function to chop off the required part of the input 
line.
Two examples of the functions that process a chopped-off slice of the line 
and append the data to the appropriate list.


def f_vector(outlist):
   x,y,z = pop_left_slice(line,3)
   outlist.append(f"<{x},{y},{z}>,")

def f_vector_array(outlist, length):
   rv = pop_left_slice(line, length)
   rv = [f'<{i[1:-1]}>' for i in rv]  #i format is: '(1.234,2.345,3.456)'
   rv = ",".join(rv)
   outlist.append(f"  //label: {lbl}\n  array[{length}]"+"{\n "+rv+"\n  }\n")


Every line can contain up to 21 data chunks. Within one file each line 
contains the same number of chunks, but it varies between files. The 
types of chunks vary and their position varies. I know beforehand how a 
line in a file is constructed. I'd like to adapt the order in which the 
functions are applied, but how?


I suspect that you're trying to over complicate things, what's wrong 
with a simple if/elif chain, a switch based on a dict or similar?




for i, line in enumerate(open("vorodat.vol",'r')):
   points = i+1


enumerate takes a start argument so you shouldn't need the above line.
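
For illustration, a short sketch of that: with start=1 the loop variable 
is already the running line count, and after the loop it holds the total, 
so the number of lines need not be known beforehand. io.StringIO stands 
in for the real file here:

```python
import io

# Stand-in for open("vorodat.vol"); any iterable of lines works the same way.
instream = io.StringIO("a b c\nd e f\ng h i\n")

points = 0                      # stays 0 if the file is empty
for points, line in enumerate(instream, start=1):
    parts = line.split()

# After the loop, points holds the number of lines that were read.
```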


   line = line.strip()
   line = line.split(" ")
   lbl = f_label(label)
   f_vector(point)


Presumably the above is points?


   f_value(radius)
   v=f_number(num_vertex)
   f_vector_array(rel_vertex,v)
   f_vector_array(glob_vertex,v)
   f_value_array(vertex_orders,v)
   f_value(max_radius)
   e=f_number(num_edge)
   f_value(edge_dist)
   ...etc

I thought about putting the functions in a dict and then create a list 
with the proper order, but can't get it to work.


Please show us your code and exactly why it didn't work.



A second question, all this works for small files with hundreds of 
lines, but some have 10. Then I can get at max 22 lists with 10 
items. Not fun. I tried writing the data to a file "out of sequence", 
not fun either. What would be the way to do this?
I thought about writing each data chunk to a proper temporary file 
instead of putting it in a list first. This would require at max 22 temp 
files and then a merge of the files into one.


I'm not absolutely sure what you're saying here, but would something 
like the SortedList from 
http://www.grantjenks.com/docs/sortedcontainers/ help?




TIA,

ingo




--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



[Tutor] text processing lines variable content

2019-02-06 Thread ingo janssen
For parsing the output of the Voro++ program and writing the data to a 
POV-Ray include file I created a bunch of functions.


def pop_left_slice(inputlist, length):
  outputlist = inputlist[0:length]
  del inputlist[:length]
  return outputlist

this is used by every function to chop off the required part of the input 
line.
Two examples of the functions that process a chopped-off slice of the line 
and append the data to the appropriate list.


def f_vector(outlist):
  x,y,z = pop_left_slice(line,3)
  outlist.append(f"<{x},{y},{z}>,")

def f_vector_array(outlist, length):
  rv = pop_left_slice(line, length)
  rv = [f'<{i[1:-1]}>' for i in rv]  #i format is: '(1.234,2.345,3.456)'
  rv = ",".join(rv)
  outlist.append(f"  //label: {lbl}\n  array[{length}]"+"{\n "+rv+"\n  }\n")


Every line can contain up to 21 data chunks. Within one file each line 
contains the same number of chunks, but it varies between files. The 
types of chunks vary and their position varies. I know beforehand how a 
line in a file is constructed. I'd like to adapt the order in which the 
functions are applied, but how?


for i, line in enumerate(open("vorodat.vol",'r')):
  points = i+1
  line = line.strip()
  line = line.split(" ")
  lbl = f_label(label)
  f_vector(point)
  f_value(radius)
  v=f_number(num_vertex)
  f_vector_array(rel_vertex,v)
  f_vector_array(glob_vertex,v)
  f_value_array(vertex_orders,v)
  f_value(max_radius)
  e=f_number(num_edge)
  f_value(edge_dist)
  ...etc

I thought about putting the functions in a dict and then create a list 
with the proper order, but can't get it to work.


A second question, all this works for small files with hundreds of 
lines, but some have 10. Then I can get at max 22 lists with 10 
items. Not fun. I tried writing the data to a file "out of sequence", 
not fun either. What would be the way to do this?
I thought about writing each data chunk to a proper temporary file 
instead of putting it in a list first. This would require at max 22 temp 
files and then a merge of the files into one.
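
A rough sketch of that temp-file idea, assuming each processed chunk is 
already a string: one temporary file per chunk type, then every temp file 
is copied into the output in turn. write_grouped and the "// name" 
header line are hypothetical, and io.StringIO stands in for the real 
output file:

```python
import io
import shutil
import tempfile


def write_grouped(rows, names, out):
    """rows: iterable of per-line lists, one chunk per name, in names order."""
    temps = {name: tempfile.TemporaryFile("w+") for name in names}
    try:
        # Pass 1: append each chunk to the temp file of its own type.
        for row in rows:
            for name, chunk in zip(names, row):
                temps[name].write(chunk + "\n")
        # Pass 2: copy the temp files into the output, one after another.
        for name in names:
            out.write(f"// {name}\n")
            temps[name].seek(0)
            shutil.copyfileobj(temps[name], out)
    finally:
        for tmp in temps.values():
            tmp.close()   # temp files are deleted when closed


rows = [["a1", "b1"], ["a2", "b2"]]
result = io.StringIO()
write_grouped(rows, ["A", "B"], result)
```

Only one input line is held in memory at a time; the temp files take 
the place of the long lists.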


TIA,

ingo